Executing SPARQL Queries over the Web of Linked Data

    Executing SPARQL Queries over the Web of Linked Data - Presentation Transcript

    1. Executing SPARQL Queries over the Web of Linked Data. Olaf Hartig*, Christian Bizer˚, Johann-Christoph Freytag*. *Humboldt-Universität zu Berlin, ˚Freie Universität Berlin
    2. ● Use URIs as names for things. ● Use HTTP URIs so that people can look up those names. ● When someone looks up a URI, provide useful information. ● Include links to other URIs so that they can discover more things. (Tim Berners-Lee, July 2006) [Figure: My Movie DB] Olaf Hartig - Executing SPARQL Queries over the Web of Linked Data
    3. (Same four principles.) [Figure: My Movie DB with the entities http://mymovie.db/movie1342, http://mymovie.db/movie0362, http://mymovie.db/movie5112 and http://mymovie.db/movie2449]
    4. (Same four principles.) [Figure: a look-up of http://mymovie.db/movie2449 against My Movie DB]
    7. (Same four principles.) [Figure: the My Movie DB entities alongside http://geo.db/country21, http://geo.db/country7, http://geo.db/cityCJ and http://geo.db/cityXA]
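The look-up described by the second and third principles can be sketched from the client side. This is a minimal illustration, not something shown on the slides: the `Accept` media types are an assumption (the slides do not name an RDF serialization), `build_lookup_request` is a hypothetical helper, and the request is only built, never sent, because `mymovie.db` is the slides' example host.

```python
from urllib.request import Request

# Assumed media-type preference; the slides do not name a serialization.
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9"


def build_lookup_request(uri: str) -> Request:
    """Build (but do not send) an HTTP GET request that asks for RDF data."""
    if not uri.startswith(("http://", "https://")):
        raise ValueError("Linked Data look-ups need an HTTP URI: " + uri)
    return Request(uri, headers={"Accept": RDF_ACCEPT}, method="GET")


# Example URI taken from the slides.
req = build_lookup_request("http://mymovie.db/movie2449")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing such a request to `urllib.request.urlopen` would perform the actual look-up against a real Linked Data server.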
    9. ● The Web: a huge, globally distributed dataspace ● Querying this dataspace opens new possibilities: aggregating data from different sources; integrating fragmentary information; achieving a more complete view
    10. Traditional approach 1: data centralization ● Querying a collection of copies from all relevant datasets
    11. Traditional approach 1: data centralization ● Querying a collection of copies from all relevant datasets ● Misses unknown or new sources ● Collection probably out of date ● Will it scale?
    12. Traditional approach 2: federated query processing ● Querying a mediator which distributes subqueries to relevant sources and integrates the results
    13. Traditional approach 2: federated query processing ● Querying a mediator which distributes subqueries to relevant sources and integrates the results ● Requires sources to provide a query service ● Requires information about the sources ● Misses unknown or new sources
    14. Main drawback: You have to know the relevant data sources in advance. You restrict yourself to the selected sources. You do not tap the full potential of the Web!
    15. A novel approach: Link Traversal Based Query Execution. Allows data sources to be discovered at runtime.
    16. Outline ● Part I: Overview of Link Traversal based Query Execution ● Part II: An Iterator based Implementation Approach
    17. Main Idea ● Intertwine query evaluation with traversal of RDF links ● Alternately: (a) evaluate parts of the query on a continuously augmented set of data; (b) look up URIs in intermediate solutions and add retrieved data to the queried data set [Figure: Queried data]
    18. Main Idea (continued) [Figure: Query with three triple patterns: ( http://.../movie2449 filmingLocation ?loc ), ( ?loc statistics ?stat ), ( ?stat unemp_rate ?ur )]
    19. Main Idea (continued) [Figure: look-up of http://.../movie2449]
    23. Main Idea (continued) [Figure: the triple ( http://.../movie2449 filmingLocation http://geo.../Italy ) appears in the queried data]
    24. Main Idea (continued) [Figure: intermediate solution ?loc = http://geo.../Italy]
    25. Main Idea (continued) [Figure: look-up of http://geo.../Italy]
    30. Main Idea (continued) [Figure: the triple ( http://geo.../Italy statistics http://stat.db/.../it ) appears in the queried data]
    31. Main Idea (continued) [Figure: intermediate solution ?loc = http://geo.../Italy, ?stat = http://stats.db/../it]
    32. In a Nutshell ● Link traversal based query execution: evaluation on a continuously augmented dataset; discovery of potentially relevant data during execution; discovery driven by intermediate solutions ● Main advantage: no need to know all data sources in advance
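The alternation between evaluation and look-up can be sketched over a simulated Web. Everything concrete here is an assumption made for illustration: the `WEB` dictionary stands in for HTTP look-ups, the full URIs and the `ex:` predicate names are hypothetical completions of the elided ones on the slides, `"RATE"` is a placeholder literal, and predicates are not dereferenced. It is a simplified sketch, not the authors' implementation.

```python
# Simulated Web: URI -> triples returned when that URI is looked up.
WEB = {
    "http://mymovie.db/movie2449": [
        ("http://mymovie.db/movie2449", "ex:filmingLocation", "http://geo.db/Italy")],
    "http://geo.db/Italy": [
        ("http://geo.db/Italy", "ex:statistics", "http://stats.db/it")],
    "http://stats.db/it": [
        ("http://stats.db/it", "ex:unemp_rate", "RATE")],  # placeholder literal
}


def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    mu = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if mu.get(p, t) != t:
                return None
            mu[p] = t
        elif p != t:
            return None
    return mu


def execute(patterns, web):
    dataset, looked_up = set(), []
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for mu in solutions:
            bound = tuple(mu.get(term, term) for term in pattern)
            # Look up URIs from the intermediate solution; augment the dataset.
            for term in bound:
                if term.startswith("http://") and term not in looked_up:
                    looked_up.append(term)
                    dataset.update(web.get(term, ()))
            # Evaluate this part of the query on the augmented dataset.
            for triple in sorted(dataset):
                mu_new = match(bound, triple)
                if mu_new is not None:
                    next_solutions.append({**mu, **mu_new})
        solutions = next_solutions
    return solutions, looked_up


QUERY = [
    ("http://mymovie.db/movie2449", "ex:filmingLocation", "?loc"),
    ("?loc", "ex:statistics", "?stat"),
    ("?stat", "ex:unemp_rate", "?ur"),
]
solutions, looked_up = execute(QUERY, WEB)
print(solutions)
print(looked_up)
```

Only the URI mentioned in the query is known up front; the other two sources are discovered through the intermediate solutions.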
    33. Real-World Examples. Return phone numbers of authors of ontology engineering papers at ESWC'09:
           SELECT DISTINCT ?author ?phone WHERE {
             ?pub swc:isPartOf <http://data.semanticweb.org/conference/eswc/2009/proceedings> .
             ?pub swc:hasTopic ?topic .
             ?topic rdfs:label ?topicLabel .
             FILTER regex( str(?topicLabel), "ontology engineering", "i" ) .
             ?pub swrc:author ?author .
             { ?author owl:sameAs ?authorAlt } UNION { ?authorAlt owl:sameAs ?author }
             ?authorAlt foaf:phone ?phone
           }
        # of query results: 2; # of retrieved graphs: 297; # of accessed servers: 16; avg. execution time: 1 min 30 sec
    34. Outline ● Part I: Overview of Link Traversal based Query Execution ● Part II: An Iterator based Implementation Approach ➢ Introduction to the Iterator Paradigm ➢ Application to Link Traversal based Query Execution ➢ URI Prefetching ➢ Extension to the Iterator Paradigm ➢ Evaluation
    35. Iterator based Query Execution ● Iterator: implements an operation; is a group of functions: OPEN, GETNEXT, CLOSE
    36. Iterator based Query Execution ● Iterator: implements an operation; is a group of functions: OPEN, GETNEXT, CLOSE ● Query execution uses a chain of iterators [Figure: chain I1, I2, I3]
    37. Iterator based Query Execution ● Iterator: implements an operation; is a group of functions: OPEN, GETNEXT, CLOSE ● Query execution uses a chain of iterators ● Each iterator is responsible for a single triple pattern [Figure: I1 for ( http://.../movie2449 filmingLocation ?loc ), I2 for ( ?loc statistics ?stat ), I3 for ( ?stat unemp_rate ?ur )]
    38. Iterator based Query Execution. I_i for tp_i: 1. Substitute tp_cur = μ_cur[tp_i] 2. Find matching triples match(tp_cur) in queried data set 3. Create solution μ' for each t in match(tp_cur) 4. Return each μ_cur ∪ μ' as a result
           Results from I_{i-1}:
                    ?c                          ?cStats
                    http://geo.db/country/US    http://stats.example.org/USstatistics
                    http://geo.db/country/IT    http://stats.example.org/ITstatistics
             μ_cur  http://geo.db/country/IT    http://stats.db/example/It
                    http://example.db/ctry/DE   http://stats.example.org/Germany
    39. (Same table and steps.) Example: tp_i = ( ?loc ex:stats ?s ), μ_cur = { ?p → http://ex... , ?loc → http://geo... }
    40. Example, step 1: tp_cur = ( http://geo... ex:stats ?s )
    41. Example, step 2: match(tp_cur) = { (http://geo... ex:stats http://db...), (http://geo... ex:stats http://ex...) }
    42. Example, step 3: μ' = { ?s → http://db... }
    43. Example, step 4: μ_cur ∪ μ' = { ?p → http://ex... , ?loc → http://geo.db/... , ?s → http://db... }
    44. Iterator based Query Execution ● Results of I_i are solutions for tp_1, …, tp_i [Figure: chain I_{i-1} for tp_{i-1}, I_i for tp_i, I_{i+1} for tp_{i+1}]
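The four steps and the OPEN, GETNEXT, CLOSE functions can be sketched as a small Python class. This is an illustrative sketch over a fixed in-memory dataset, not the SWClLib code: the class name, the `ex:` predicates and the full URIs are hypothetical, and the first iterator in the chain receives a single empty solution as its input.

```python
def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    mu = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if mu.get(p, t) != t:
                return None
            mu[p] = t
        elif p != t:
            return None
    return mu


class TriplePatternIterator:
    """Iterator I_i for one triple pattern tp_i, fed by I_{i-1} (source)."""

    def __init__(self, pattern, dataset, source=None):
        self.pattern, self.dataset, self.source = pattern, dataset, source

    def open(self):  # OPEN
        if self.source is not None:
            self.source.open()
        self._root_done = False
        self._pending = []

    def _next_input(self):
        if self.source is not None:
            return self.source.get_next()
        if self._root_done:
            return None
        self._root_done = True
        return {}  # the first iterator starts from one empty solution

    def get_next(self):  # GETNEXT
        while not self._pending:
            mu_cur = self._next_input()
            if mu_cur is None:
                return None
            # Step 1: substitute bound variables.
            tp_cur = tuple(mu_cur.get(term, term) for term in self.pattern)
            for triple in self.dataset:           # step 2: find matching triples
                mu_new = match(tp_cur, triple)     # step 3: create solution mu'
                if mu_new is not None:
                    self._pending.append({**mu_cur, **mu_new})  # step 4
        return self._pending.pop(0)

    def close(self):  # CLOSE
        if self.source is not None:
            self.source.close()
        self._pending = []


DATA = [  # hypothetical queried data set
    ("http://ex.org/movie", "ex:filmingLocation", "http://geo.db/IT"),
    ("http://geo.db/IT", "ex:stats", "http://db.org/it"),
    ("http://geo.db/IT", "ex:stats", "http://ex.org/it"),
]
i1 = TriplePatternIterator(("?p", "ex:filmingLocation", "?loc"), DATA)
i2 = TriplePatternIterator(("?loc", "ex:stats", "?s"), DATA, source=i1)

i2.open()
results = []
while True:
    mu = i2.get_next()
    if mu is None:
        break
    results.append(mu)
i2.close()
print(results)
```

Each result of `i2` is a solution for both triple patterns, which mirrors the statement that results of I_i are solutions for tp_1, …, tp_i.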
    46. Application to Link Traversal ● The queried data set grows [Figure: chain I_{i-1} for tp_{i-1}, I_i for tp_i, I_{i+1} for tp_{i+1}]
    47. Application to Link Traversal ● The queried data set grows ● Look-up requirement: do not evaluate tp_cur until the queried data set contains all data that can be retrieved from all URIs in tp_cur
    48. Application to Link Traversal. I_i for tp_i (same table of results from I_{i-1} as on slide 38): 1. Substitute tp_cur = μ_cur[tp_i] 2. Ensure look-up requirement for tp_cur 3. Find matching triples match(tp_cur) in queried data set 4. Create solution μ' for each t in match(tp_cur) 5. Return each μ_cur ∪ μ' as a result
    49. (Same table and steps.) Step 2: initiate look-ups and wait
    51. Blocked Query Execution ● Waiting for URI look-ups blocks query execution [Figure: chain I_{i-1}, I_i, I_{i+1}; label: initiate look-ups and wait]
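The five-step variant with the look-up requirement can be sketched with Python generators, where pulling the next item plays the role of GETNEXT. The simulated `WEB`, the `ex:` predicates and the full URIs are assumptions made for illustration; the look-up in step 2 is synchronous, which is exactly the blocking behaviour the slides point out.

```python
# Simulated Web: URI -> triples returned by a look-up (hypothetical data).
WEB = {
    "http://mymovie.db/movie2449": [
        ("http://mymovie.db/movie2449", "ex:filmingLocation", "http://geo.db/Italy")],
    "http://geo.db/Italy": [
        ("http://geo.db/Italy", "ex:statistics", "http://stats.db/it")],
}


def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    mu = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if mu.get(p, t) != t:
                return None
            mu[p] = t
        elif p != t:
            return None
    return mu


def evaluate(pattern, inputs, dataset, web, log):
    """Iterator I_i as a generator; `log` records look-ups in order."""
    for mu_cur in inputs:
        tp_cur = tuple(mu_cur.get(term, term) for term in pattern)   # step 1
        for term in tp_cur:                                           # step 2
            if term.startswith("http://") and term not in log:
                log.append(term)
                dataset.update(web.get(term, ()))  # blocks until data is added
        for triple in sorted(dataset):                                # step 3
            mu_new = match(tp_cur, triple)                            # step 4
            if mu_new is not None:
                yield {**mu_cur, **mu_new}                            # step 5


dataset, log = set(), []
i1 = evaluate(("http://mymovie.db/movie2449", "ex:filmingLocation", "?loc"),
              iter([{}]), dataset, WEB, log)
i2 = evaluate(("?loc", "ex:statistics", "?stat"), i1, dataset, WEB, log)
results = list(i2)
print(results)
print(log)
```

Because the whole chain runs in one thread, every look-up in step 2 stalls all iterators until the data has arrived.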
    53. URI Prefetching ● Waiting for URI look-ups blocks query execution ● URI prefetching: when a URI is bound to a variable, initiate the look-up in the background [Figure: chain I_{i-1}, I_i, I_{i+1}; labels: initiate look-up; ensure look-up is finished; initiate look-ups and wait]
    54. URI Prefetching. I_i for tp_i (same table of results from I_{i-1} as on slide 38): 1. Substitute tp_cur = μ_cur[tp_i] 2. Ensure look-up requirement for tp_cur 3. Find matching triples match(tp_cur) in queried data set 4. Create solution μ' for each t in match(tp_cur) 5. Initiate parallel look-up for each new URI in μ' 6. Return each μ_cur ∪ μ' as a result
    55. URI Prefetching [Figure: chain I_{i-1}, I_i, I_{i+1}; labels: initiate look-up; ensure look-up is finished; initiate look-ups and wait]
    56. URI Prefetching [Figure: chain I_{i-1}, I_i, I_{i+1}; labels: initiate look-up; wait until look-up is finished; initiate look-ups and wait]
    57. URI Prefetching ● Even with URI prefetching, query execution may block [Figure: chain I_{i-1}, I_i, I_{i+1}; label: wait until look-up is finished]
    58. URI Prefetching ● Even with URI prefetching, query execution may block ● Possible solutions: program parallelism; asynchronous pipeline ● Drawback: requires major rewrite of existing query engines
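URI prefetching can be sketched with a thread pool: a look-up starts as soon as a URI is bound, and the iterator that later needs the data only waits for whatever is still outstanding. The `Prefetcher` class, the simulated `WEB`, the artificial delay and the URIs are all assumptions for illustration. Retrieved data is merged into the queried data set in the calling thread, which is a simplification chosen here to avoid locking.

```python
import time
from concurrent.futures import ThreadPoolExecutor

WEB = {  # simulated Web with hypothetical data
    "http://mymovie.db/movie2449": [
        ("http://mymovie.db/movie2449", "ex:filmingLocation", "http://geo.db/Italy")],
    "http://geo.db/Italy": [
        ("http://geo.db/Italy", "ex:statistics", "http://stats.db/it")],
}


class Prefetcher:
    def __init__(self, web, dataset, delay=0.01):
        self.web, self.dataset, self.delay = web, dataset, delay
        self.pool = ThreadPoolExecutor(max_workers=4)
        self.futures, self.merged = {}, set()

    def _fetch(self, uri):
        time.sleep(self.delay)  # stands in for network latency
        return self.web.get(uri, ())

    def prefetch(self, uri):
        """Initiate a background look-up unless one was already started."""
        if uri not in self.futures:
            self.futures[uri] = self.pool.submit(self._fetch, uri)

    def ensure(self, uri):
        """Wait (only if necessary) until the data for uri is in the dataset."""
        self.prefetch(uri)
        if uri not in self.merged:
            self.dataset.update(self.futures[uri].result())
            self.merged.add(uri)


def match(pattern, triple):
    mu = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if mu.get(p, t) != t:
                return None
            mu[p] = t
        elif p != t:
            return None
    return mu


def evaluate(pattern, inputs, dataset, prefetcher):
    for mu_cur in inputs:
        tp_cur = tuple(mu_cur.get(term, term) for term in pattern)   # step 1
        for term in tp_cur:                                           # step 2
            if term.startswith("http://"):
                prefetcher.ensure(term)
        for triple in sorted(dataset):                                # step 3
            mu_new = match(tp_cur, triple)                            # step 4
            if mu_new is None:
                continue
            for value in mu_new.values():                             # step 5
                if value.startswith("http://"):
                    prefetcher.prefetch(value)
            yield {**mu_cur, **mu_new}                                # step 6


dataset = set()
prefetcher = Prefetcher(WEB, dataset)
i1 = evaluate(("http://mymovie.db/movie2449", "ex:filmingLocation", "?loc"),
              iter([{}]), dataset, prefetcher)
i2 = evaluate(("?loc", "ex:statistics", "?stat"), i1, dataset, prefetcher)
results = list(i2)
prefetcher.pool.shutdown()
print(results)
```

In this sketch `ensure` can still block when a prefetched look-up has not finished, which corresponds to the remaining blocking noted on slide 57.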
    60. Postponing Iterator ● Enabled by an extension of the iterator paradigm: ● New function POSTPONE: take the most recently provided result back ● Adjusted GETNEXT: either return the next result or return a formerly postponed result ● POSTPONE allows an iterator to temporarily reject the input solution μ_cur
    61. Postponing Iterator. I_i for tp_i (same table of results from I_{i-1} as on slide 38): 1. Substitute tp_cur = μ_cur[tp_i] 2. POSTPONE μ_cur if the look-up requirement does not hold for tp_cur 3. Find matching triples match(tp_cur) in queried data set 4. Create solution μ' for each t in match(tp_cur) 5. Initiate parallel look-up for each new URI in μ' 6. Return each μ_cur ∪ μ' as a result
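The POSTPONE extension can be sketched as follows. The slides only say that the adjusted GETNEXT returns either the next result or a formerly postponed one; the policy used here (fresh results first, postponed ones afterwards) is an assumption, as are the class name and the `pending` counters that simulate how many checks a look-up still needs before it is finished.

```python
from collections import deque


class PostponingSource:
    """Input side of an iterator, extended with POSTPONE."""

    def __init__(self, solutions):
        self._fresh = deque(solutions)
        self._postponed = deque()
        self._last = None

    def get_next(self):
        """Adjusted GETNEXT: next fresh result, else a postponed one."""
        if self._fresh:
            self._last = self._fresh.popleft()
        elif self._postponed:
            self._last = self._postponed.popleft()
        else:
            return None
        return self._last

    def postpone(self):
        """POSTPONE: take the most recently provided result back."""
        self._postponed.append(self._last)


# Simulated look-up state: number of checks until the look-up is finished.
pending = {"http://geo.db/IT": 2, "http://geo.db/US": 0, "http://geo.db/DE": 1}


def lookup_finished(mu):
    uri = mu["?loc"]
    if pending[uri] > 0:
        pending[uri] -= 1
        return False
    return True


source = PostponingSource([{"?loc": uri} for uri in pending])
order = []
while True:
    mu_cur = source.get_next()
    if mu_cur is None:
        break
    if not lookup_finished(mu_cur):
        source.postpone()          # reject mu_cur for now, try another one
        continue
    order.append(mu_cur["?loc"])   # here the iterator would evaluate tp_cur
print(order)
```

Solutions whose look-ups are already finished are processed first, so the iterator keeps working instead of waiting on the slowest look-up.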
    63. Evaluation ● Implementation: Semantic Web Client Library (SWClLib), http://www4.wiwiss.fu-berlin.de/bizer/ng4j/semwebclient/ ● Berlin SPARQL Benchmark (BSBM): simulates an e-commerce scenario; mix of 12 SPARQL queries; generates datasets of different sizes (scaling factor) ● Simulation of the Web of Linked Data: a Linked Data server publishes the BSBM datasets ● Experiment: adjusted BSBM queries link to the simulation server; execute the query mix with SWClLib
    64. Evaluation [Figure: avg. execution time per query mix in seconds (y-axis, 0 to 250) against BSBM scaling factor (x-axis, 10 to 60); series: w/o prefetching; w/ prefetching; non-blocking + prefetching; all data retrieved in advance]
           scal. factor   # of triples   # of entities
           10             4,971          613
           20             8,485          928
           30             11,999         1,245
           40             16,918         1,845
           50             22,616         2,599
           60             26,108         2,914
    65. Take-away Summary ● Novel query execution approach for the Web of Data: utilizes the characteristics of the Web; traverses RDF links during query execution; discovery of new data sources; no need to know all data sources in advance ● Implementation approach: iterator based execution with URI prefetching; extension of the iterator paradigm (POSTPONE) ● New research challenges: improving result completeness; investigating suitable caching strategies
    66. Try it! ● SQUIN, http://squin.org ● Provides SWClLib functionality as a Web service ● Accessible like a SPARQL endpoint ● Public SQUIN service at http://squin.informatik.hu-berlin.de/SQUIN/
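Since the service is described as accessible like a SPARQL endpoint, a request could presumably be formed the way the SPARQL protocol prescribes for GET, with the query text in a `query` parameter. The sketch below only builds the URL and sends nothing. That the service root accepts the parameter directly is an assumption, the service may no longer be online, and the `http://example.org/...` predicate is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Service address as given on slide 66; its availability is not guaranteed.
SERVICE = "http://squin.informatik.hu-berlin.de/SQUIN/"

# Hypothetical query; the predicate URI is a made-up stand-in.
QUERY = """SELECT ?loc WHERE {
  <http://mymovie.db/movie2449> <http://example.org/filmingLocation> ?loc .
}"""


def build_query_url(service: str, query: str) -> str:
    """Encode a SPARQL query as a protocol-style GET URL (nothing is sent)."""
    return service + "?" + urlencode({"query": query})


url = build_query_url(SERVICE, QUERY)
print(url)
```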
    67. These slides have been created by Olaf Hartig (http://olafhartig.de). This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (http://creativecommons.org/licenses/by-sa/3.0/).

    Olaf Hartig, 4 weeks ago

    With these slides I presented my paper at the Inter more
