Adventures in Linked Data Land

Richard Light Consultancy

Culture Geeks, 25 February 2009
Culture Geeks talk: "Adventures in Linked Data Land", by Richard Light.

25 February 2009, Regency Town House

Culture Geeks is a Brighton-based community open to everyone who is
interested in using digital technologies in the cultural sector.

Culture Geeks Feb talk: Adventures in Linked Data Land

  1. Adventures in Linked Data Land
     Richard Light Consultancy
     Culture Geeks, 25 February 2009
  2. Discovering Linked Data
     Four principles of Linked Data (Tim B-L):
     ● Use URIs to identify resources
     ● Use HTTP URIs so that people can look them up
     ● Provide useful information about the resource
     ● Include links to other URIs in your data
  3. Discovering dbPedia
     ● Extraction of Linked Data from Wikipedia
     ● Statements in info boxes (mainly) become RDF triples:
         <rdf:Description rdf:about="http://dbpedia.org/resource/Berlin_Marathon">
           <dbpprop:location rdf:resource="http://dbpedia.org/resource/Berlin"/>
         </rdf:Description>
     Note the URLs
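As a rough illustration of how the snippet on slide 3 reads as a triple, the following sketch parses it with Python's standard library. The wrapping `rdf:RDF` element and the binding of `dbpprop` to `http://dbpedia.org/property/` are assumptions added so that the fragment is well-formed; the slide shows only the `rdf:Description` element.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: dbpprop is bound to the dbPedia property namespace.
DBPPROP_NS = "http://dbpedia.org/property/"

RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dbpprop="{DBPPROP_NS}">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Berlin_Marathon">
    <dbpprop:location rdf:resource="http://dbpedia.org/resource/Berlin"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Yield (subject, predicate, object) for resource-valued properties only."""
    root = ET.fromstring(rdf_xml)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells a qualified name as "{namespace}local";
            # dropping the braces gives the full property URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is not None:
                yield subject, predicate, obj


for s, p, o in triples(RDF_XML):
    print(s, p, o)
```

All three parts of the triple come out as HTTP URIs, which is the point of "Note the URLs".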
  4. Browsing Linked Data
     ● View RDF as a web page:
         http://dbpedia.org/page/Berlin
     ● Navigate from one data source to another
     ● Specialist Linked Data browsers/plugins:
       – DISCO
       – Marbles
       – OpenLink Data Explorer
       – Tabulator
  5. dbPedia page for Berlin
  6. OpenLink Data Explorer: What
  7. OpenLink Data Explorer: Where
  8. Querying Linked Data
     ● SPARQL query language:
         http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/
     ● And SPARQL XML results format:
         http://www.w3.org/TR/rdf-sparql-XMLres/
     ● “SPARQL end-points”:
         http://dbpedia.org/sparql
         http://dbtune.org/bbc/peel/sparql
         http://data.linkedmdb.org/sparql
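To make the SPARQL XML results format mentioned on slide 8 a little more concrete, here is a minimal sketch that reads one result row using only the standard library. The sample document is invented for illustration (the variable name `place` and its single binding are assumptions); only the element names and the results namespace come from the W3C format.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Illustrative sample only; not output captured from a real end-point.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="place"/></head>
  <results>
    <result>
      <binding name="place"><uri>http://dbpedia.org/resource/Berlin</uri></binding>
    </result>
  </results>
</sparql>"""


def rows(xml_text):
    """Return one dict per <result>, mapping variable name to its value."""
    root = ET.fromstring(xml_text)
    out = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text
        out.append(row)
    return out


print(rows(SAMPLE))
```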
  9. dbPedia SPARQL endpoint page
  10. Asking interesting questions
     ● German musicians born in Berlin:
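The query itself did not survive in this transcript. As a hedged sketch of what a "German musicians born in Berlin" question might look like, the example below builds a request URL for the dbPedia end-point. The query text is an assumption, not the one shown in the talk: the properties `dbo:birthPlace`, `dct:subject` and `foaf:name`, and the category URI, are guesses at suitable dbPedia vocabulary. No network request is made.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"

# Hypothetical query: property and category names are assumptions.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name WHERE {
  ?person dbo:birthPlace dbr:Berlin ;
          dct:subject <http://dbpedia.org/resource/Category:German_musicians> ;
          foaf:name ?name .
}
ORDER BY ?name
"""


def query_url(endpoint, query):
    """Encode the query as the 'query' parameter of an HTTP GET URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = query_url(ENDPOINT, QUERY)
print(url)
```

Passing the query in a `query` parameter follows the SPARQL Protocol; the result format an end-point returns by default may vary.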
  11. So what do we have here?
     ● An initiative to generate lots of Linked Data
     ● A Linked Data Cloud, containing a growing number of RDF datasets
     ● A hard-to-use query language capable of very precise and powerful querying
     Where do museums come into this picture?
  12. The Wordsworth Trust
     ● Typical museum collection: about 60,000 objects
     ● Major collection of manuscripts (notebooks, letters, etc.)
     ● Objects published to the Web from a ModesXML database
     ● Unwise enough to allow me Remote Desktop access ...
  13. Typical collections object
     GRMDC.C104.2
  14. Same object represented as RDF
  15. Same object represented as XTM
  16. One identifier; three “views”
     ● This object has a single persistent identifier:
         http://collections.wordsworth.org.uk/object/GRMDC.C104.2
     ● This maps to different views depending on the “Accept” header in the HTTP request:
       – application/rdf+xml >> RDF
       – application/xtm+xml >> XTM Topic Map
       – Otherwise >> HTML (human-readable)
     ● Achieved through a custom 404 “page not found” handler
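The Accept-header mapping on slide 16 can be sketched as a small lookup. This is an illustration only, not the handler's actual code: the view names `rdf`, `xtm` and `html` are labels chosen here, and the parsing is deliberately naive (it ignores q-values and simply takes the first supported media type listed).

```python
# Media types from slide 16, mapped to view labels chosen for this sketch.
VIEWS = {
    "application/rdf+xml": "rdf",
    "application/xtm+xml": "xtm",
}


def choose_view(accept_header):
    """Pick a view from an HTTP Accept header; anything else falls back to HTML."""
    for part in accept_header.split(","):
        # Drop parameters such as ";q=0.9" and normalise case.
        media_type = part.split(";")[0].strip().lower()
        if media_type in VIEWS:
            return VIEWS[media_type]
    return "html"


print(choose_view("application/rdf+xml"))
print(choose_view("text/html,application/xhtml+xml"))
```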
  17. “Page not found” handler (1)
     ● All URLs are fictitious, so they generate a 404
     ● Modified a generic smart 404 handler from:
         http://evolvedcode.net/content/code_smart404/
     ● Added support for “303 See other” redirects
     ● Added wildcard matching to re-format URLs
  18. “Page not found” handler (2)
     ● Generic URL, plus requested Accept format, determine initial “303 See other” mapping, e.g.:
         http://collections.wordsworth.org.uk/object/GRMDC.C104.2
         + Accept: application/rdf+xml
         = http://collections.wordsworth.org.uk/object/rdf/GRMDC.C104.2
     ● When this is passed back in, the 404 handler has to generate the required RDF directly
     ● Can't just keep redirecting requests!
  19. “Page not found” handler (3)
     ● Redirect rules declare mappings:
  20. “Page not found” handler (4)
     ● Generic URL plus a supported Accept type generates a “303 See other” redirect
     ● If it comes back as a page request, it is further redirected with a “301 Moved permanently” to the object's web page
     ● If it comes back as an RDF or XTM request, the record is fetched as XML and subjected to an XSLT transform by the handler
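The decision flow described across the four "page not found" slides can be sketched as a pure function. This is an assumption-laden outline rather than the modified smart 404 handler itself: the `html` and `xtm` path segments, the `web_page_url` helper and its `example.org` address, and the `render` stand-in for the XSLT step are all hypothetical. Only the `/object/` base and the `/object/rdf/` form come from the slides.

```python
import re

BASE = "http://collections.wordsworth.org.uk/object/"
DATA_VIEWS = {"application/rdf+xml": "rdf", "application/xtm+xml": "xtm"}


def web_page_url(object_id):
    # Hypothetical: the real address of the object's web page is not given.
    return "http://example.org/objects/" + object_id + ".html"


def render(view, object_id):
    # Stand-in for "fetch the record as XML and apply an XSLT transform".
    return f"<!-- {view} view of {object_id} -->"


def handle_404(url, accept):
    """Return (status, location_or_body) for a URL that matched no real file."""
    if not url.startswith(BASE):
        return 404, None
    tail = url[len(BASE):]
    match = re.fullmatch(r"(rdf|xtm|html)/([^/]+)", tail)
    if match:
        # Second pass: the URL already names a view, so stop redirecting.
        view, object_id = match.groups()
        if view == "html":
            return 301, web_page_url(object_id)
        return 200, render(view, object_id)
    if "/" in tail or not tail:
        return 404, None
    # First pass: generic URL plus Accept type gives a "303 See other".
    # Simplification: treats the Accept header as a single media type.
    view = DATA_VIEWS.get(accept.split(";")[0].strip().lower(), "html")
    return 303, BASE + view + "/" + tail


print(handle_404(BASE + "GRMDC.C104.2", "application/rdf+xml"))
```

Separating the first pass (redirect) from the second pass (generate or forward) is what prevents the redirect loop that slide 18 warns about.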
  21. What has been learnt?
     ● The Linked Data paradigm encourages simple RDF triples: no “blank nodes”
     ● For an object, this becomes a simple metadata set, very analogous to the PNDS DCAP format
     ● The properties involved need to encapsulate the whole relation between object and data, e.g.
         <p:title>Ulswater from Pooley Bridge</p:title>
         <p:technique>drawn</p:technique>
         <p:maker>Farington, Joseph (1747-1821)</p:maker>
         <p:technique>engraved</p:technique>
         <p:maker>Middiman, Samuel (1750-1831)</p:maker>
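A flat property set like the one on slide 21 could be serialised along the following lines. This is a sketch under assumptions: the `p` prefix is bound to `http://dbpedia.org/property/` (the namespace named on slide 22), the subject is the object identifier from slide 16, and the museum's real output comes from an XSLT transform, not from Python.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
P = "http://dbpedia.org/property/"  # assumption: namespace behind the p: prefix
ET.register_namespace("rdf", RDF)
ET.register_namespace("p", P)

PAIRS = [
    ("title", "Ulswater from Pooley Bridge"),
    ("technique", "drawn"),
    ("maker", "Farington, Joseph (1747-1821)"),
    ("technique", "engraved"),
    ("maker", "Middiman, Samuel (1750-1831)"),
]


def describe(subject_uri, pairs):
    """Serialise (property, string) pairs as one blank-node-free rdf:Description."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": subject_uri}
    )
    for name, value in pairs:
        ET.SubElement(desc, f"{{{P}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


print(describe("http://collections.wordsworth.org.uk/object/GRMDC.C104.2", PAIRS))
```

Because RDF triples carry no ordering, nothing in this flat form ties "drawn" to Farington or "engraved" to Middiman, which is why each property has to carry the whole relation on its own.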
  22. Properties: which framework?
     ● I have used dbPedia properties (for Linked Data compatibility):
         http://dbpedia.org/property/title
         http://dbpedia.org/property/maker
     ● A viable alternative would be PNDS DCAP:
         http://purl.org/dc/elements/1.1/title
         http://purl.org/dc/elements/1.1/creator
     ● One framework which doesn't fit is the CIDOC CRM:
         E18 Physical Thing – E12 Production – E39 Actor = “creator”
  23. The problem of URIs
     ● Good Linked Data requires URIs everywhere
     ● Most of my museum RDF resolves to strings
     ● One exception is Geonames lookup:
         Ullswater becomes http://www.geonames.org/2635191/
     ● In the absence of a central “people” registry, I should be minting URIs for people myself
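As one possible reading of "minting URIs for people myself", the sketch below turns a name string into a URI and swaps a known place name for its Geonames URI. The `/person/` base path and the slug rule are hypothetical; only the Ullswater mapping comes from the slide.

```python
import re

# The single place mapping given on slide 23.
GEONAMES = {"Ullswater": "http://www.geonames.org/2635191/"}

# Hypothetical base for locally minted person URIs.
PEOPLE_BASE = "http://collections.wordsworth.org.uk/person/"


def mint_person_uri(name):
    """Collapse every run of non-alphanumeric characters into one underscore."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_")
    return PEOPLE_BASE + slug


def resolve_place(name):
    """Return a Geonames URI when one is known, otherwise the original string."""
    return GEONAMES.get(name, name)


print(mint_person_uri("Farington, Joseph (1747-1821)"))
print(resolve_place("Ullswater"))
```

A slug derived from the display name is only stable while the name string is; a real scheme would more likely key on a record identifier.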
  24. Does it work? - yes, sort of
  25. Data Explorer place view
  26. Implementation details
     ● HTML needed a “back link” to RDF to keep OpenLink Explorer happy:
         <link rel="alternate" type="application/rdf+xml"
           href="http://collections.wordsworth.org.uk/object/data/GRMDC.C104.2"
           title="RDF" />
     ● Result is totally unfindable: need a search or harvesting mechanism:
       – OAI support (possible)
       – SPARQL end-point (harder)
  27. Conclusions
     ● Implementing an RDF Linked Data front-end to a museum database is feasible if:
       – You can generate multiple outputs from your database (XML is sufficient)
       – You can implement a suitable URL rewriter or 404 handler
     ● It's easy (and a good idea) to mint and publish URIs for your collection objects
     ● It's less clear where all the other URIs we'll need will come from
  28. LD: foothills of the Semantic Web
     ● Linked Data is a very modest start
     ● It's not obvious how this will scale
     ● Full Semantic Web will involve machine-driven processes
     ● Judging by where we are today, that will be a while coming ...
  29. Ask Multimap where Lancaster is
  30. Get a Netbook delivered ...
