KIT – University of the State of Baden-Wuerttemberg and
National Laboratory of the Helmholtz Association
Institute AIFB
ww...
Linked Data and Services

Presentation of Linked Open Services and Linked Data Services at the PlanetData kick-off

  • MAT (materialisation-based approaches): collect the data from all known sources in advance,
    preprocess the combined data,
    and store the results in a central database;
    queries are evaluated against the local database.
    DQP (distributed query processing approaches): parse, normalise, and split the query into subqueries,
    determine the sources containing results for the subqueries,
    and evaluate the subqueries against the sources directly.

    Match with later architecture overview animation

    MAT:
    excellent query response times, due to the large amount of preprocessing carried out during the load and indexing steps
    aggregated data is never current, as collecting and indexing vast amounts of data is time-consuming
    from the viewpoint of a single requester with a particular query, there is a large amount of unnecessary data gathering and storage
    because the data is stored as a replica, the data providers have to give up sole sovereignty over their data (e.g., they can no longer restrict or log access, since queries are answered against a copy of the data)

    DQP:
    the system is more dynamic, with up-to-date data
    new sources can be added easily, without time lag for indexing and integrating the data
    the systems require less storage and fewer processing resources at the query-issuing site
    DQP systems cannot give strict guarantees about query performance, since the integration system relies on a large number of potentially unreliable sources

    Source selection affects efficiency of query execution
    @Juergen: join processing as scan (DL) or in Jena (QTree)?
  • We do not want to materialise, but to query in a distributed way: Linked Data lookups use the Web architecture (this also differs from distributed SPARQL)
    Traditional approaches assume a few data sources with full query processing capabilities (three giant blobs versus 100 small sources)
    Linked Data: very large number of relatively small sources (kilobytes to megabytes)
    HTTP GET is sole operation
    We assume relatively stable source URIs
    Focus on tree-shaped conjunctive queries, full SPARQL can be layered on top
  • The upper right is the standard application of Linked Data principles: if you request HTML (that is, state in the request header that you accept it), you are redirected to a 'page' URI; if you request RDF, you are redirected to a 'data' URI (i.e., in our implementation, page/data is appended to the end of the resource's URI). This is because the original URI actually identifies the airport but, since the airport is a real thing, not an information resource, you can't retrieve it itself, only a related information resource.

    The bottom right is how we extend in LOS – under the same URI scheme you can ask for a computation relative to the resource by POSTing to a URI representing the weather under it (the airport).
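
The redirect behaviour described in the two notes above can be sketched as a small routing function. This is only an illustration under stated assumptions: the path `/airport/LESO`, the function name `respond`, and the 406 fallback are hypothetical and not taken from the slides. The slide itself shows `GET /weather`, so the sketch models GET requests only.

```python
from __future__ import annotations

# Media types the slide lists for RDF responses.
RDF_TYPES = ("application/rdf+xml", "text/n3")


def respond(path: str, accept: str) -> tuple[int, str]:
    """Map a GET request to (status, target) for a non-information resource.

    `path` is a hypothetical resource path such as "/airport/LESO";
    `accept` is the value of the Accept request header.
    """
    wants_rdf = any(media_type in accept for media_type in RDF_TYPES)

    if path.endswith("/weather"):
        # Service-style request under the resource's URI: answer with RDF directly.
        return (200, "rdf") if wants_rdf else (406, "")

    if wants_rdf:
        # The thing itself cannot be retrieved; redirect to its RDF description.
        return (303, path + "/data")
    if "text/html" in accept:
        # Redirect browsers to the human-readable description.
        return (303, path + "/page")

    # Assumption: anything else is rejected as not acceptable.
    return (406, "")
```

For example, `respond("/airport/LESO", "text/html")` yields a 303 to `/airport/LESO/page`, while the same path requested with `application/rdf+xml` yields a 303 to `/airport/LESO/data`.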
  • Linked Data and Services

    1. KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association, Institute AIFB, www.kit.edu. Linked Data and Services. Andreas Harth and Barry Norton
    2. Outline: Motivation; Linked Data Principles; Query Processing over Linked Data; Linked Data Services (LIDS) and Linked Open Services (LOS); Conclusion
    3. Motivation: Semantic Web/Linked Data technologies are well-suited for data integration. Diagram labels: Data Integration; Interactive Data Exploration; Common Data Format/Access Protocol !? (30.01.2015)
    4. Linked Data Principles (*)
       1. Use URIs to name things; not only documents, but also people, locations, concepts, etc.
       2. To enable agents (human users and machine agents alike) to look up those names, use HTTP URIs.
       3. When someone looks up a URI we provide useful information; with 'useful' in the strict sense we usually mean structured data in RDF.
       4. Include links to other URIs, allowing agents (machines and humans) to discover more things.
       (*) http://www.w3.org/DesignIssues/LinkedData.html
    5. Correspondence between thing-URI and source-URI (hash URI): the user agent issues an HTTP GET for http://www.polleres.net/foaf.rdf#me, and the web server returns RDF from http://www.polleres.net/foaf.rdf
    6. Correspondence between thing-URI and source-URI (303 redirect): the user agent issues an HTTP GET for http://dbpedia.org/resource/Gordon_Brown, the web server answers with 303, and a second HTTP GET retrieves RDF from http://dbpedia.org/data/Gordon_Brown (the slide also shows http://dbpedia.org/page/Gordon_Brown)
    8. Queries over Linked Data
       SELECT ?f ?n WHERE { an:f#ah foaf:knows ?f. ?f foaf:name ?n. }
       (result columns: ?f ?n)
       SELECT ?x1 ?x2 WHERE { dblppub:HoganHP08 dc:creator ?a1. ?x1 owl:sameAs ?a1. ?x2 foaf:knows ?x1. }
    9. Querying Data Across Sources (15.03.2010)
       Data warehousing or materialisation-based approaches (MAT). Diagram labels: CRAWL; INDEX; SERVE; SELECT * FROM…; R; S
       Distributed query processing approaches (DQP). Diagram labels: R; S
    10. DQP on Linked Data. Diagram labels: SELECT * FROM…; R; S; SELECT ?s WHERE…; TP; HTTP GET; ODBC
    11. Query Processing Overview (footer: Andreas Harth, Data Summaries for On-Demand Queries over Linked Data)
        Query: SELECT ?f ?n WHERE { an:f#ah foaf:knows ?f. ?f foaf:name ?n. }
        Triple patterns, each followed by "Select source(s)": TP (an:f#ah foaf:knows ?f); TP (?f foaf:name ?n)
        Example result: ?f = http://danbri.org/foaf.rdf#danbri, ?n = Dan Brickley
    12. Problem: Source Selection for Triple Patterns
        Given a triple pattern, which source can contribute bindings for the triple pattern?
        Patterns: (?s ?p ?o), (#s ?p ?o), (?s #p ?o), (?s ?p #o), (#s #p ?o), (#s ?p #o), (?s #p #o), (#s #p #o)
    13. Schema-Level Indices [Stuckenschmidt et al. 2004]
        Keep index of properties and/or classes contained in sources
        Patterns: (?s #p ?o), (?s rdf:type #o)
        Covers only queries containing schema-level elements
        Commonly used properties select potentially too many sources
        Example queries: the two queries from slide 8
    14. Direct Lookup (DL) [Hartig et al. 2009]
        Exploits correspondence between thing-URI and source-URI
        Linked Data sources (aka RDF files) typically return triples with a subject corresponding to the source: (#s ?p ?o), (#s #p ?o), (#s #p #o)
        Sometimes the sources return triples with an object corresponding to the source: (?s ?p #o), (?s #p #o)
        Incomplete wrt. patterns but also wrt. URI reuse across sources
        Limited parallelism, unclear how to schedule lookups
        Example queries: the two queries from slide 8
    15. Approximate Data Summaries
        Combined description of schema-level and instance-level
        Use approximation to reduce index size (incurs false positives)
        Possible to use entire query for source selection
        Parallel lookups since sources can be determined for the entire query
        Patterns: (?s ?p ?o), (#s ?p ?o), (?s #p ?o), (?s ?p #o), (#s #p ?o), (#s ?p #o), (?s #p #o), (#s #p #o) and combinations of triple patterns
        Example queries: the two queries from slide 8
    16. Implementation
        Deploy wrappers "in the cloud"
        Google App Engine: hosting of Java and Python webapps on Google's Cloud infrastructure
        Limited amount of processing time (6hrs/day)
        Single-threaded applications
        Suited for deploying wrappers, e.g. http://twitter2foaf.appspot.com/ converts Twitter user data to RDF
    17. Linking Open Data Cloud 2007
    18. Linking Open Data Cloud 2008
    19. Linking Open Data Cloud 2009
    20. Linking Open Data Cloud 2010
    21. Geonames Services
    22. Geonames Services
    23. Geonames Services: {"weatherObservation": {"clouds":"broken clouds", "weatherCondition":"drizzle", "observation":"LESO 251300Z 03007KT 340V040 CAVOK 23/15 Q1010", "windDirection":30, "ICAO":"LESO", ...
    24. Geonames Services: {"weatherObservation": {"clouds":"broken clouds", "weatherCondition":"drizzle", "observation":"LESO 251300Z 03007KT 340V040 CAVOK 23/15 Q1010", "windDirection":30, "ICAO":"LESO", ...
    25. Linked Open Service Principles
        REST Principles
        1. Application state and functionality is divided into resources
        2. Every resource is uniquely addressable
        3. All resources share a uniform interface: a) a constrained set of well-defined operations; b) a constrained set of content types
        Linked Data Principles
        1. Use URIs as names for things
        2. Use HTTP URIs so that people can look up those names
        3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
        4. Include links to other URIs, so that they can discover more things
        Linked Open Service Principles
        1. Describe services as LOD prosumers with input and output descriptions as SPARQL graph patterns
        2. Communicate RDF by RESTful content negotiation
        3. The output should make explicit its relation with the input
    26. LOS Weather Service
    27. LOS Geo Resources
    28. Resource-Based Linked Open Services
        Linked Data: GET with Accept: text/html is answered with 303 REDIRECT to /page; GET with Accept: application/rdf+xml (or text/n3) is answered with 303 REDIRECT to /data
        Linked Service: GET /weather with Accept: application/rdf+xml (or text/n3) is answered with 200 and <rdf:Description>
    29. Interlinking Data with Data from Services?
    30. Linked Data Services
        We'd like to integrate data services with Linked Data: 1. LIDS need to adhere to Linked Data principles
        We'd like to use data services in software programs: 2. LIDS need machine-readable descriptions of input and output
        Compared to the naïve approach (assign URI to service output):
        Relationship between input and output is explicitly described
        Dynamicity is supported
        Multiple or no output resources can be linked to input
    31. Interlink LIDS and Linked Data
        Generate service URIs with input bindings, from evaluating: select Xi where Ti
        sameAs: binding for i
    32. Query Answering using LIDS and Linked Data
        Query execution resolves URIs => enlarges data set
        LIDS are interlinked
        Query is executed again on new data set
        Repeat until no new links or no new data
        Combine results
    33. Experiment: Query Answering
        Input: List of 562 (potential) universities from Facebook Graph API
        Output: Facebook fans and DBpedia student numbers for 104 universities
        PREFIX u: <http://openlids.org/universities.rdf#> SELECT ?n ?f ?s WHERE { u:list foaf:topic ?u . ?u foaf:name ?n . ?u og:fan_count ?f . ?u d:numberOfStudents ?s }
    34. Linked * Services and PlanetData
        Several areas seem likely to produce services:
        Stream, inc. Sensor, resources (latest values)
        Any others exposing dynamic resources
        Dynamic computations, inc. on-the-fly quality assessments
        Other areas seem likely to consider service technologies and move towards more service-like HTTP interactions:
        Access control (OpenID, OAuth, etc.)
        Finally, remaining areas could serve to complement LIDS/LOS alignment:
        Provenance
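
The Direct Lookup idea from slide 14, which relies on the thing-URI/source-URI correspondence of slides 5 and 6, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the tuple representation of triple patterns are assumptions made here. Variables are written with a leading `?`, and only constant HTTP URIs in subject or object position yield a source to look up, which reflects the incompleteness the slide mentions.

```python
from urllib.parse import urldefrag


def is_variable(term: str) -> bool:
    """Variables in a triple pattern are written with a leading '?'."""
    return term.startswith("?")


def source_uri(thing_uri: str) -> str:
    """Return the URI to dereference for a thing-URI.

    Hash URIs lose their fragment (foaf.rdf#me -> foaf.rdf). Slash URIs are
    returned unchanged; dereferencing them would rely on the server's 303
    redirect, which this sketch does not perform.
    """
    return urldefrag(thing_uri).url


def direct_lookup_sources(pattern: tuple) -> list:
    """Select candidate sources from the subject and object of one pattern."""
    subject, _predicate, obj = pattern
    sources = []
    for term in (subject, obj):
        if is_variable(term) or not term.startswith("http"):
            continue  # nothing to dereference for variables or literals
        candidate = source_uri(term)
        if candidate not in sources:
            sources.append(candidate)
    return sources


# Pattern with a constant subject: one source can be determined.
print(direct_lookup_sources(
    ("http://www.polleres.net/foaf.rdf#me", "foaf:knows", "?f")))
# Pattern with only a constant predicate: no source can be determined.
print(direct_lookup_sources(("?s", "foaf:name", "?o")))
```

A pattern such as (?s #p ?o) returns an empty list here, which is where schema-level indices or data summaries would be needed instead.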