Linked Data



       Victor de Boer
Slide stolen from Christophe Gueret
Why Linked Data?
Why linked data (1/2)




Slide stolen from Christophe Gueret
Why linked data (2/2)




Slide stolen from Christophe Gueret
“Sharable, spreadable and nerd-friendly”




                           -- Charlotte S H Jensen, kulturweb
Four rules of Linked Data
1. Use URIs as names for things (Resources)

2. Use HTTP URIs so that people can look up those
   names. (Dereferencing)

3. When someone looks up a URI, provide useful
   information, using the standards (RDF*, SPARQL)

4. Include links to other URIs, so that they can
   discover more things.

                          http://www.w3.org/DesignIssues/LinkedData.html
Linked Open Data five star system
★          Available on the web (whatever format), but with an open license

★★         Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)

★★★        As (2), plus a non-proprietary format (e.g. CSV instead of Excel)

★★★★       All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

★★★★★      All the above, plus: link your data to other people’s data to provide context


                   www.w3.org/designissues/linkeddata.html
Linked Data Cloud Diagram
May 2007
Oct 2007
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
Amsterdam Museum as Linked Open Data
Use case on how to transform “raw” XML data into 5-star Linked Open Data
Europeana
• “Europeana enables people to explore the digital
  resources of Europe's museums, libraries, archives and
  audio-visual collections.’’
                                     www.europeana.eu




 From portal…                                              …to data aggregator.
Amsterdam Museum
• Formerly Amsterdam Historic Museum
   – “The rich collection of works of art, objects
     and archaeological finds brings to life the
     fortunes of Amsterdammers of days gone
     by and today.”

• In March 2010 the museum published its whole
  collection online
   – 70,000 objects
   – CC license

• We converted their data to RDF
AM metadata
• Adlib database XML API

• Object metadata
      • 73,000 objects, 256 MB
      • Nested XML

      <record priref="10541">
         <acquisition.date>1997</acquisition.date>
         <dimension>
           <dimension.type>hoogte</dimension.type>
           <dimension.unit>cm</dimension.unit>
           <dimension.value>6</dimension.value>
           …
         </dimension>
      </record>

• Concept Thesaurus
      • 27,000 concepts, 9 MB
      • Different types (geo, motif, event)

      <record priref="28024">
         <term>Kalverstraat 124</term>
         <broader_term>Kalverstraat</broader_term>
         <term.type>GEOKEYW </term.type>
      </record>

• Person ‘Thesaurus’
      • 67,000 persons, 10 MB
      • Consolidated from object metadata fields
      • Creators, annotators, reproduction creators, institutions

      <record priref="6">
         <biography>boekverkoper en uitgever van cartografie</biography>
         <birth.date.start>1659</birth.date.start>
         <death.date.start>1733</death.date.start>
         <name>Aa, Pieter van der</name>
         <nationality>Nederlands</nationality>
         <use>Aa, Pieter van der (I)</use>
      </record>
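
The nesting in the object records can be handled with a small recursive walk. The following is a minimal sketch, assuming the record shown above (with plain straight quotes) and Python's standard xml.etree.ElementTree; the helper name flatten_record and the slash-separated path format are illustrative choices, not part of the Adlib API.

```python
import xml.etree.ElementTree as ET

# Sample record modelled on the object metadata shown on the slide.
RECORD_XML = """
<record priref="10541">
  <acquisition.date>1997</acquisition.date>
  <dimension>
    <dimension.type>hoogte</dimension.type>
    <dimension.unit>cm</dimension.unit>
    <dimension.value>6</dimension.value>
  </dimension>
</record>
"""


def flatten_record(xml_text):
    """Return (priref, leaves), where leaves lists (path, text) per leaf element."""
    record = ET.fromstring(xml_text)
    leaves = []

    def walk(element, prefix):
        for child in element:
            path = prefix + [child.tag]
            if len(child) == 0:
                # Leaf element: keep its text, addressed by its nesting path.
                leaves.append(("/".join(path), (child.text or "").strip()))
            else:
                # Nested element such as <dimension>: descend one level.
                walk(child, path)

    walk(record, [])
    return record.get("priref"), leaves


priref, leaves = flatten_record(RECORD_XML)
for path, text in leaves:
    print(priref, path, text)
```

Keeping the full path (for example dimension/dimension.unit) preserves the grouping of type, unit and value that a flat key-value reading of the record would lose.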
Back to the four rules of Linked Data
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up
   those names.
3. When someone looks up a URI, provide
   useful information, using the standards
   (RDF*, SPARQL)
4. Include links to other URIs, so that they can
   discover more things.

                       http://www.w3.org/DesignIssues/LinkedData.html
How to make cool URIs
Use HTTP://
Use a namespace you control
Unique, stable and persistent

• Don’t use:
  – Author name, subject, status, access, file name
    extension, software mechanism
  C://MyDisk/awesome/VdeBoer/latest/cgi_bin/rembrandt.html
Amsterdam Museum URIs
• PURL basename: http://purl.org/collections/nl/am/

• Objects: Use “prirefs”, prefixed by “proxy-”
   – http://purl.org/collections/nl/am/proxy-63432


• Concepts & Persons: Use “prirefs”, prefixed by “p-”, or “t-”
   – http://purl.org/collections/nl/am/p-201


• Properties (schema): Use XML element name
   – http://purl.org/collections/nl/am/acquisition.date
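
The URI scheme above can be sketched as a pair of small functions. This is an illustration only: the function names are hypothetical, and the reading that “p-” marks persons and “t-” marks thesaurus concepts is an assumption based on the examples in these slides.

```python
BASE = "http://purl.org/collections/nl/am/"

# Assumed mapping of record kind to URI prefix ("p-" persons, "t-" concepts).
PREFIXES = {"object": "proxy-", "person": "p-", "concept": "t-"}


def mint_uri(kind, priref):
    """Build a resource URI from a record kind and its Adlib priref."""
    if kind not in PREFIXES:
        raise ValueError("unknown record kind: %r" % (kind,))
    priref = str(priref).strip()
    if not priref.isdigit():
        raise ValueError("priref must be numeric: %r" % (priref,))
    return BASE + PREFIXES[kind] + priref


def property_uri(element_name):
    """Build a schema (property) URI from an XML element name."""
    return BASE + element_name


print(mint_uri("object", 63432))
print(mint_uri("person", "201"))
print(property_uri("acquisition.date"))
```

Because the URI depends only on the namespace and the priref, it stays stable when titles, authors or software change, which is the point of the “cool URIs” advice.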
Again, the rules of Linked Data
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up
   those names.
3. When someone looks up a URI, provide
   useful information, using the standards
   (RDF*, SPARQL)
4. Include links to other URIs, so that they can
   discover more things.

                       http://www.w3.org/DesignIssues/LinkedData.html
RDF reminder
Triples:

Subject           Predicate         Object

am:Rembrandt      am:hasBirthdate   “1606”

am:Rembrandt      foaf:knows        am:PiterLastman

am:PiterLastman   am:wasBornIn      geonames:Amsterdam

Graph: the same triples drawn as nodes and labelled edges. am:Rembrandt points to “1606” and to am:PiterLastman; am:PiterLastman points to geonames:Amsterdam.
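
The step from triples to graph can be sketched in a few lines. This is a minimal illustration using plain tuples rather than an RDF library; the prefixed names are the slide's own examples, the birth year 1606 is taken from the maker field of the record used later in the deck, and the helper names are hypothetical.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object); literals keep their quotes.
TRIPLES = [
    ("am:Rembrandt", "am:hasBirthdate", '"1606"'),
    ("am:Rembrandt", "foaf:knows", "am:PiterLastman"),
    ("am:PiterLastman", "am:wasBornIn", "geonames:Amsterdam"),
]


def build_graph(triples):
    """Index triples by subject so each node lists its outgoing edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def reachable(graph, start):
    """Return every node that can be reached by following edges from start."""
    seen = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.append(node)
        for _, obj in graph.get(node, []):
            stack.append(obj)
    return seen


graph = build_graph(TRIPLES)
print(reachable(graph, "am:Rembrandt"))
```

Because the object of one triple can be the subject of another, following edges from am:Rembrandt also reaches geonames:Amsterdam, which is what makes a set of triples a graph rather than a table.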
RDF conversion
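<record priref="19319 ">
   <date>1651</date>
   <maker>Rembrandt (1606-1669)</maker>
   <object.type>etsplaat</object.type>
   …
</record>

Direct conversion (one blank node per record, literal values only):

_:bn1             rdf:type         am:Record
_:bn1             priref           “19319 ”
_:bn1             date             “1651”
_:bn1             maker            “Rembrandt (1606-1669)”
_:bn1             object.type      “etsplaat”

After clean-up and linking to resources:

am:proxy-19319    rdf:type         am:Record
am:proxy-19319    am:priref        “19319 ”
am:proxy-19319    am:date          “1651”
am:proxy-19319    am:maker         am:p-1234

am:p-1234         rdf:type         am:Person
am:p-1234         am:priref        “1234”
am:p-1234         am:birthdate     “1606”
am:p-1234         rda:name         “Rembrandt”

am:etsplaat       rdf:type         skos:Concept
am:etsplaat       skos:prefLabel   “etsplaat”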
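
The move from literal values to linked resources can be sketched as follows. This is not the XMLRDF tool itself, only a hedged illustration of the idea: the function record_to_triples, the regular expression for the maker string, and the way the person priref is passed in are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

AM = "http://purl.org/collections/nl/am/"

RECORD_XML = (
    '<record priref="19319">'
    "<date>1651</date>"
    "<maker>Rembrandt (1606-1669)</maker>"
    "<object.type>etsplaat</object.type>"
    "</record>"
)

# Splits "Name (birth-death)" into its parts; an assumed format for maker strings.
MAKER_PATTERN = re.compile(r"^(?P<name>.+?)\s*\((?P<birth>\d{4})-(?P<death>\d{4})\)$")


def record_to_triples(xml_text, person_priref):
    """Convert one record to triples, turning the maker into a person resource.

    person_priref stands in for a lookup in the person thesaurus, which this
    sketch does not implement.
    """
    record = ET.fromstring(xml_text)
    subject = AM + "proxy-" + record.get("priref").strip()
    triples = [(subject, "rdf:type", AM + "Record")]
    for child in record:
        value = (child.text or "").strip()
        if child.tag == "maker":
            person = AM + "p-" + person_priref
            # Link to a resource instead of repeating the string literal.
            triples.append((subject, AM + "maker", person))
            match = MAKER_PATTERN.match(value)
            if match:
                triples.append((person, "rda:name", match.group("name")))
                triples.append((person, AM + "birthdate", match.group("birth")))
        else:
            # Other fields stay literals; the property is named after the element.
            triples.append((subject, AM + child.tag, value))
    return triples


triples = record_to_triples(RECORD_XML, person_priref="1234")
for triple in triples:
    print(triple)
```

Splitting the maker string gives the person its own URI, so that other records by the same maker, and later external sources, can point at the same node.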
Architecture
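[Architecture diagram. Clients: a SPARQL app and a browser (via the Purl.org redirect). Server: a SPARQL endpoint and a web interface on an HTTP server, over RDF(S) storage and logic, running on Prolog.]
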
                                           http://semanticweb.cs.vu.nl/
How to access the data
• PURL 303 redirect to VU semantic layer
   http://purl.org/collections/nl/am/proxy-63432
   → http://semanticweb.cs.vu.nl/europeana/browse/list_resource?r=http://purl.org/collections/nl/am/proxy-63432

• At our server: content negotiation
   – HTTP request text/html:
      • Local condensed view
      • Local full view
   – HTTP request application/rdf+xml
      • rdf/xml “describe”

• SPARQL endpoint
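
Content negotiation comes down to the Accept header the client sends. The sketch below builds such a request with Python's standard urllib.request and deliberately does not send it, since the servers named in these slides may no longer be reachable; the helper name build_request is hypothetical.

```python
import urllib.request

OBJECT_URI = "http://purl.org/collections/nl/am/proxy-63432"


def build_request(uri, want_rdf=True):
    """Prepare a GET request asking for RDF/XML or for the HTML view."""
    accept = "application/rdf+xml" if want_rdf else "text/html"
    return urllib.request.Request(uri, headers={"Accept": accept})


rdf_request = build_request(OBJECT_URI)
html_request = build_request(OBJECT_URI, want_rdf=False)

print(rdf_request.full_url, rdf_request.get_header("Accept"))
print(html_request.full_url, html_request.get_header("Accept"))

# To actually fetch the description (needs network access):
#     with urllib.request.urlopen(rdf_request) as response:
#         body = response.read()
```

The same URI is used in both cases; only the header differs, so a browser gets a readable page while a program gets the RDF description.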
text/html
text/html
application/rdf+xml
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ore: <http://www.openarchives.org/ore/terms/> .
@prefix ens: <http://www.europeana.eu/schemas/edm/> .
@prefix ahm: <http://purl.org/collections/nl/am/> .


ahm:proxy-66970
       a ore:Proxy ;
       ahm:title "Zegelstempel Felix Meritis"@nl ;
       ahm:material ahm:t-12463 ,
                   ahm:t-5447 ;
       ahm:objectCategory ahm:t-5504 ;
       ahm:objectName ahm:t-13817 ,
                     ahm:t-8489 ;
       ahm:objectNumber "KA 7653.1" ;
       ahm:priref "66970" .

ahm:proxy-66972
       a ore:Proxy ;
       ahm:acquisitionDate "0000" ;
       ahm:title "Zegelstempel mogelijk van familiewapen"@nl .
SPARQL




http://semanticweb.cs.vu.nl/europeana/user/query
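
A SPARQL query can also be sent programmatically as an HTTP GET with a query parameter, as the SPARQL protocol defines. The sketch below only builds the URL. The query is the one from the editor's notes for this slide (objects whose material is labelled “gietijzer”, cast iron); the endpoint path in ENDPOINT is an assumption, since the slide only gives the address of the interactive query page.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint path; the slide shows only the interactive query form.
ENDPOINT = "http://semanticweb.cs.vu.nl/europeana/sparql/"

QUERY = """PREFIX am: <http://purl.org/collections/nl/am/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?proxy ?x
WHERE { ?proxy am:material ?x . ?x skos:prefLabel "gietijzer"@nl }
ORDER BY ?proxy
LIMIT 50"""


def sparql_get_url(endpoint, query):
    """Encode a SPARQL query as the 'query' parameter of a GET URL."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again returns the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```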
Again, the rules of Linked Data
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up
   those names.
3. When someone looks up a URI, provide
   useful information, using the standards
   (RDF*, SPARQL)
4. Include links to other URIs, so that they can
   discover more things.

                       http://www.w3.org/DesignIssues/LinkedData.html
Link to other sources
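am:proxy-19319         rdf:type           am:Record
am:proxy-19319         am:priref          “19319 ”
am:proxy-19319         am:date            “1651”
am:proxy-19319         am:maker           am:p-1234

am:p-1234              rdf:type           am:Person
am:p-1234              am:priref          “1234”
am:p-1234              am:birthdate       “1606”
am:p-1234              rda:name           “Rembrandt”

am:p-1234              owl:sameAs (?)     Viaf:RebrandtvanRijn

Viaf:RebrandtvanRijn   rdf:type           Viaf:Person
Viaf:RebrandtvanRijn   Viaf:nationality   “Dutch”
Viaf:RebrandtvanRijn   rdfs:label         “Rembrandt Harmensz. Van Rijn”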
Amalgame alignment platform
• Semi-automatic matching
   – Simple automatic
     techniques,
   – chained together by hand

• 3500+ links put in RDF
   – 143 places linked to
     GeoNames
   – 1076 persons linked to
     ULAN (VIAF)
   – 34 persons linked to
     DBPedia
   – 2498 concepts linked to
     AATNed
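
One of the simple automatic techniques that can be chained in such a workflow is exact matching on normalised labels. The sketch below is not Amalgame's implementation, only an illustration of that kind of matcher; the URIs in the sample data, the choice of skos:exactMatch as the link predicate, and the function names are all hypothetical.

```python
import unicodedata


def normalize(label):
    """Lower-case, strip accents and collapse whitespace in a label."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def exact_label_matches(source, target, predicate="skos:exactMatch"):
    """Link each source URI to a target URI when exactly one label matches."""
    index = {}
    for uri, label in target.items():
        index.setdefault(normalize(label), []).append(uri)
    links = []
    for uri, label in source.items():
        candidates = index.get(normalize(label), [])
        # Ambiguous matches are skipped and left for manual review.
        if len(candidates) == 1:
            links.append((uri, predicate, candidates[0]))
    return links


# Hypothetical sample data: local place concepts and external place names.
local_places = {"ex:t-1": "Amsterdam ", "ex:t-2": "Kalverstraat", "ex:t-3": "Málaga"}
external_places = {"ex:geo-10": "amsterdam", "ex:geo-20": "Malaga", "ex:geo-30": "Rotterdam"}

links = exact_label_matches(local_places, external_places)
for link in links:
    print(link)
```

Skipping ambiguous candidates keeps the automatic step conservative, which fits a semi-automatic process where the remaining cases are resolved by hand.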
CKAN Data Hub




http://thedatahub.org/dataset/amsterdam-museum-as-edm-lod
Four rules and Five stars
1. Use URIs as names for
   things
2. Use HTTP URIs so that
   people can look up those
   names.
3. When someone looks up a
   URI, provide useful
   information, using the
   standards (RDF*, SPARQL)
4. Include links to other URIs,
   so that they can discover
   more things.
And now applications!…right??
Developers still do this…




                            …although more
                             and more of this
                             is happening
Some issues with L(O)D
• Extra burden on the data provider
• Nerd-only (aka “SPARQL is hard”)
• How do we build user-friendly systems?
  – Ranking, user-friendly information presentation

• Scalability (how do you query a huge graph?)

• Licenses
• Is Open always a good idea?
  – Context?
end
EDM
What kind of RDF?
• Europeana Data Model (EDM)
  – Keep original metadata intact
  – Use Semantic Web (Linked Data) principles: RDF

• Re-use of standard models
  – Dublin Core for metadata representation
     • creator, date, title etc.
  – SKOS for vocabularies
     • prefLabel, broader, etc.
EDM example
[Diagram: an Aggregation carries provenance plus web views/images; a proxy carries the object metadata; the Physical Object itself carries no metadata.]

Linked data: Four rules and five stars for the Amsterdam Museum

Editor's Notes

  • #8 Things = “resources”
  • #21 The XML is not completely straightforward (it is nested)
  • #28 XMLRDF tool: clean up, link to resources etc.
  • #34 SPARQL query:
      PREFIX am: <http://purl.org/collections/nl/am/>
      PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
      SELECT ?proxy ?x
      WHERE { ?proxy am:material ?x . ?x skos:prefLabel "gietijzer"@nl }
      ORDER BY ?proxy
      LIMIT 50
  • #40 Apps for Amsterdam; Plaatsen van Betekenis
  • #45 Apps for Amsterdam; Plaatsen van Betekenis
  • #46 Apps for Amsterdam; Plaatsen van Betekenis