Scripting User Contributed Interlinking

Presentation about User Contributed Interlinking at Scripting for the Semantic Web (SFSW) 2008 workshop at European Semantic Web Conference (ESWC) 2008

  1. Scripting User Contributed Interlinking
     Michael Hausenblas, Wolfgang Halb, and Yves Raimond
     Institute of Information Systems & Information Management
     SFSW08, Tenerife, Spain, 2008-06-02
  2. Agenda
     • Linked Data 101
     • A first step in UCI: http://riese.joanneum.at
     • Towards Generalising UCI
     • Demo
  3. Linked Data: Principles
     • Items should be identified using URI references [URIrefs] (and: don't use bNodes)
     • URIrefs should be dereferenceable: using HTTP URIs allows looking up the items identified through URIrefs, cf. [http-range-14 TAG finding]
     • Looking up a URIref leads to more data [follow-your-nose principle]
     • Links to other URIrefs should be included in order to enable the discovery of more data [How to Publish Linked Data on the Web]
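The lookup step in these principles can be illustrated with a small sketch. It only builds an HTTP request that asks for RDF via content negotiation and does not send it. The function name `build_lookup_request`, the `Accept` preference order, and the `example.org` URI are assumptions made for illustration; none of them come from the slides.

```python
from urllib.request import Request

# Assumed preference order for RDF serialisations (illustrative only).
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9, */*;q=0.1"


def build_lookup_request(uriref: str) -> Request:
    """Build (but do not send) a GET request that asks for RDF about a URIref."""
    if not uriref.startswith(("http://", "https://")):
        raise ValueError("only HTTP URIs can be dereferenced this way")
    # A fragment identifier is never sent to the server, so request the
    # document URI that remains after stripping it.
    document_uri = uriref.split("#", 1)[0]
    return Request(document_uri, headers={"Accept": RDF_ACCEPT})


# Hypothetical URIref, used only to show the shape of the request.
req = build_lookup_request("http://example.org/resource/Graz#this")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; redirects such as 303 responses are followed by default.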
  4. Linked Data: Datasets (2008)
     Courtesy of Richard Cyganiak, http://richard.cyganiak.de/2007/10/lod/
  5. Linked Data: Issues
     • Building
         • RDFising process (schema, mapping)
         • Interlinking (automagically, manual)
         • Deployment (SPARQL end point, dump, RDFa, etc.)
     • Using
         • Provenance, trust, rights, etc.
         • Access (depending on deployment)
         • Performance (deref chain, reliability)
         • Discovery (which is the right LOD dataset for my task?)
  6. A first step in UCI: riese
     http://riese.joanneum.at
  7. riese: A first step in UCI
     • riese, the 'RDFizing and Interlinking the EuroStat Dataset Effort', aims to offer an RDFised and interlinked version of the Eurostat data (http://ec.europa.eu/eurostat)
     • Eurostat data is high-volume data (5 GB data dump in approx. 4,000 TSV files; 350 million data values; 80,000 different data codes)
     • Currently we serve 3.6 million triples, interlinking with Geonames (DBpedia and Wordnet upcoming)
     • Data is exposed as XHTML+RDFa, as a SPARQL end point, and as a dump (plus semantic sitemap description)
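To make the RDFising step concrete, here is a minimal sketch that turns a tiny TSV table into N-Triples. It is not the riese pipeline: the `example.org` namespace, the `def/geo` property, the table layout, and the use of `:` as a missing-value marker are all assumptions for illustration. Only `rdf:value` is a real, standard property. The time period is encoded in the item URI only, to keep the sketch short.

```python
import csv
import io

BASE = "http://example.org/eurostat/"  # hypothetical namespace, not riese's
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"
MISSING = {"", ":"}  # assumed markers for "no value available"


def tsv_to_ntriples(tsv_text, dataset):
    """Emit two N-Triples lines per non-missing cell of a TSV table."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    periods = [p.strip() for p in header[1:]]  # first column holds the code
    triples = []
    for row in reader:
        if not row:
            continue
        code = row[0].strip()
        for period, cell in zip(periods, row[1:]):
            value = cell.strip()
            if value in MISSING:
                continue
            float(value)  # raises ValueError if the cell is not numeric
            item = f"<{BASE}{dataset}/{code}/{period}>"
            # Link the data item to the (hypothetical) resource for its code.
            triples.append(f"{item} <{BASE}def/geo> <{BASE}geo/{code}> .")
            # Attach the observed value as a plain literal.
            triples.append(f'{item} <{RDF_VALUE}> "{value}" .')
    return triples


SAMPLE = "geo\t2005\t2006\nAT\t8.2\t:\nDE\t82.5\t82.4\n"
triples = tsv_to_ntriples(SAMPLE, "population")
for line in triples:
    print(line)
```

The geo resources minted here are the natural place to attach interlinks, for example to the corresponding Geonames entries.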
  8. riese: architecture
  9. riese: inside
     • Server
         • Apache 2.2
         • SWI-Prolog
         • PHP 5
         • p2r/Ceriese (see Yves's blog post)
         • (RDF/XML documents in the file system)
     • Client
         • XHTML+RDFa
         • Javascript/Yahoo! Interface Library [YUI]
     • Vocabulary (triggered the development of scovo, the Statistical Core Vocabulary, together with Talis and Lee Feigenbaum; see http://purl.org/NET/scovo)
  10. riese: User Contributed Interlinking
  11. riese: User Contributed Interlinking
  12. riese: issues
     • Dynamic content (Ajax) vs. embedded metadata (RDFa). A local agent has the data in the DOM, but an external agent cannot access it. No real solution yet.
     • Scalability & Performance. When data is fine-grained and high-volume, how much should be embedded directly in a page?
     • How to notify users about data updates? We are currently experimenting with AtomOwl deployed in RDFa (http://riese.joanneum.at/updates/)
  13. Towards Generalising UCI
     • Next step after riese was to decouple the UCI and generalise it. The result is: IRS (interlinking of resources with semantics, see also poster session)
     • IRS features
         • query, add, remove semantic links (owl:sameAs, rdfs:seeAlso, foaf:topic, etc.)
         • subject and object can be set by user (restriction: URIs only)
         • resource preview (debug)
         • expose data in XHTML+RDFa + SPARQL end point
         • lookup in http://sindice.com for unknown resources
         • simple provenance tracking through named graphs
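The feature list above (query, add, remove links; URIs only; provenance through named graphs) can be sketched as a tiny in-memory quad store. This is an illustration under stated assumptions, not the IRS implementation: the class `LinkStore`, the helper `is_http_uri`, the narrowing of "URIs only" to HTTP(S) URIs, and the example graph and resource URIs are all hypothetical.

```python
from urllib.parse import urlparse

# Link types named on the slide, mapped to their full property URIs.
ALLOWED = {
    "owl:sameAs": "http://www.w3.org/2002/07/owl#sameAs",
    "rdfs:seeAlso": "http://www.w3.org/2000/01/rdf-schema#seeAlso",
    "foaf:topic": "http://xmlns.com/foaf/0.1/topic",
}


def is_http_uri(value):
    """Assumed reading of 'URIs only': accept absolute HTTP(S) URIs."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


class LinkStore:
    """Stores (subject, predicate, object, graph) quads in memory."""

    def __init__(self):
        self._quads = set()

    def add(self, subject, predicate, obj, graph):
        if predicate not in ALLOWED:
            raise ValueError(f"unsupported link type: {predicate}")
        if not (is_http_uri(subject) and is_http_uri(obj)):
            raise ValueError("subject and object must be URIs")
        # The named graph records who contributed the link (provenance).
        self._quads.add((subject, ALLOWED[predicate], obj, graph))

    def remove(self, subject, predicate, obj, graph):
        self._quads.discard((subject, ALLOWED[predicate], obj, graph))

    def query(self, subject=None, graph=None):
        return sorted(
            q for q in self._quads
            if (subject is None or q[0] == subject)
            and (graph is None or q[3] == graph)
        )


store = LinkStore()
alice = "http://example.org/graphs/alice"  # hypothetical contributor graph
store.add("http://example.org/geo/AT", "owl:sameAs",
          "http://example.org/other/Austria", alice)
print(store.query(graph=alice))
```

Keeping each contributor's links in a separate graph means a contribution can later be reviewed or withdrawn without touching anyone else's links.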
  14. Towards Generalising UCI: IRS
  15. Towards Generalising UCI: IRS
  16. IRS issues
     • Motivation for end-user to contribute has yet to be researched
     • Trust issues arise (experimenting with OpenID)
     • Generic UCI requires high level of abstraction (maybe only for geeks and not suitable for an end-user)
     • To get an overview of what is available some other mechanism should be offered (currently only SPARQL end point)
     • Validation of resources is desirable (e.g. type of target, information vs. non-information resource, etc.)
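One possible shape for the validation mentioned in the last point is a check based on a simplified reading of the httpRange-14 finding: a 2xx response suggests an information resource, while a 303 redirect or a hash URI leaves the kind of resource open. The function `classify_resource` and its return labels are hypothetical, and the status code is passed in rather than fetched, so the sketch runs without network access.

```python
def classify_resource(uri: str, status: int) -> str:
    """Guess the kind of resource from its URI shape and HTTP status code.

    Simplified heuristic, not a complete implementation of httpRange-14.
    """
    if "#" in uri:
        # Hash URIs are resolved via their document; the thing named by
        # the fragment may be any kind of resource.
        return "any resource (hash URI)"
    if 200 <= status < 300:
        return "information resource"
    if status == 303:
        return "any resource (303 See Other)"
    return "unknown"


# Hypothetical URIs and status codes, for illustration only.
print(classify_resource("http://example.org/page/Graz", 200))
print(classify_resource("http://example.org/resource/Graz", 303))
print(classify_resource("http://example.org/about#me", 200))
```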
  17. Discussion
     • UCI can help create high-quality semantic links
     • Social process needs to be researched (might turn out that it is pretty similar to the Wiki ecosystem)
     • Some types of content, such as multimedia content, might benefit more from UCI than others
     • Is generic UCI only for geeks? To really be successful, the UCI likely needs to be embedded into a domain-specific application
     • BTW, IRS is also a nice LOD debugger ;)
     • Questions?