Semantic Web and Linked Data for cultural heritage materials - Approaches in Europeana

Presented at DANS Workshop on Linked Data

Published in: Technology
Transcript of "Semantic Web and Linked Data for cultural heritage materials - Approaches in Europeana"

  1. 1. Semantic Web and Linked Data for cultural heritage materials - Approaches in Europeana. Antoine Isaac, Vrije Universiteit Amsterdam / Europeana. DANS Linked Data and RDF workshop, Den Haag, July 28th 2010
  2. 2. A web of cultural heritage data?
  3. 3. ?
  4. 4. The current portal
  5. 6. Towards semantic search: facets
  6. 7.
      • Building a search engine on top of metadata is difficult
          • Intrinsic quality problems: correctness, coverage
      • Especially when data is so heterogeneous
          • 100s of formats
          • From flat 5-field records to 100-node XML trees
          • Language issue!
      • We currently use a simple interoperability format
          • A quick win that quickly shows its limits
  7. 8. Semantic ThoughtLab: experimenting with solutions
      • We can make better use of institutions’ original metadata
      • Accommodate their different practices
          • Data structures and semantics
      • Access objects via a semantic layer of vocabularies for subjects, persons, places…
  8. 9. Towards semantics-enabled search
      • Building a "semantic layer" to help access content
  9. 10. Towards semantics-enabled search
      • Enhance access to Europeana content by semantics
          • Query expansion, clustering of results
      • Exploiting various types of relations
          • "located in", "lived in", "is more specific concept"…
      • Semantics are already there, in metadata and in the "controlled vocabularies" used in metadata
          • Thesauri, classifications…
      • They need to be made properly machine-accessible
  10. 11. Prototype: Europeana Thought Lab
          • http://europeana.eu/portal/thought-lab.html
  11. 12. Semantic auto-completion
  12. 13. Clustering of results
  13. 14. Baseline: matching concepts' labels
      [Figure labels: Controlled place name from a vocabulary at the Rijksmuseum; Metadata for the object]
  14. 15. A "more specific Egypte"?
  15. 16. A "more specific Egypte"?
      [Figure label: Metadata for the object]
  16. 17. A place more specific than the Egypt one
      [Figure label: Semantic information on the Giza place in the Rijksmuseum vocabulary]
  17. 18. Following other relations
  18. 19. Following other relations - creator
      [Figure labels: Metadata for the object; Controlled person name from a vocabulary at the Rijksmuseum]
  19. 20. Following other relations - match
      [Figure labels: Information on Gustave Le Gray from the Rijksmuseum vocabulary; Matched to a "Gustave Le Gray" from another vocabulary]
  20. 21. Following other relations - death place
      [Figure label: Information on Gustave Le Gray from the Union List of Artist Names (Getty)]
  21. 22. Following other relations - death place
      [Figure labels: Information on Cairo from the Thesaurus of Geographic Names (Getty); Matched to "Cairo" from another vocabulary…]
  22. 23.
      • A hell of relations?
      • Well, they were in the original data, we just had to make them explicit!
      • Cultural heritage institutions often have a wealth of metadata to share and exploit
  23. 24. Enabling bits & pieces
      • Exploiting semantic links in CH vocabularies
          • Rijksmuseum thesaurus: concept “Giza” narrower than concept “Egypte”
      • Mapping/alignment between CH vocabularies
          • Louvre’s “Égypte” equivalent to Rijksmuseum’s “Egypte”
      • Enrichment of existing metadata
          • The string “Egypt” in a metadata record indicates the concept of Egypt defined in the Rijksmuseum thesaurus
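The three "bits" on this slide (narrower links, cross-vocabulary equivalences, and enrichment of plain strings) can be combined into a simple query expansion. The following is a minimal sketch only: the prefixes, object identifiers and label table are made up for illustration, and the triples are plain Python tuples rather than data from any Europeana system.

```python
# Toy triples. All identifiers here are illustrative assumptions, not real URIs.
TRIPLES = {
    ("rma:Giza", "skos:broader", "rma:Egypte"),          # Giza is narrower than Egypte
    ("rma:Cairo", "skos:broader", "rma:Egypte"),
    ("louvre:Égypte", "skos:exactMatch", "rma:Egypte"),  # alignment between vocabularies
    ("obj:pyramid_photo", "dc:subject", "rma:Giza"),
    ("obj:sphinx_statue", "dc:subject", "louvre:Égypte"),
    ("obj:tulip_print", "dc:subject", "rma:Tulip"),
}

# Enrichment: map free-text strings found in records to controlled concepts.
LABELS = {"egypt": "rma:Egypte", "egypte": "rma:Egypte"}


def enrich(text):
    """Return the concept a plain string refers to, or None if unknown."""
    return LABELS.get(text.strip().lower())


def expand(concept, triples):
    """Return the concept plus everything reachable via narrower or equivalent links."""
    seen = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            neighbours = []
            if p == "skos:broader" and o == current:
                neighbours.append(s)  # s is narrower than current
            if p == "skos:exactMatch" and current in (s, o):
                neighbours.append(o if s == current else s)
            for n in neighbours:
                if n not in seen:
                    seen.add(n)
                    frontier.append(n)
    return seen


def search(text, triples):
    """Find objects whose subject is the queried concept or an expansion of it."""
    concept = enrich(text)
    if concept is None:
        return []
    wanted = expand(concept, triples)
    return sorted(s for s, p, o in triples if p == "dc:subject" and o in wanted)


results = search("Egypt", TRIPLES)
print(results)
```

A query for the string "Egypt" then also reaches objects described with the narrower Giza concept or with the Louvre's equivalent concept, while unrelated objects stay out of the result.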
  24. 25. SKOS, Knowledge Organization Systems and Linked Data
      • SKOS allows representing (simple) KOS data as RDF
      [Example thesaurus from the slide:]
        animals
          NT cats
        cats
          UF domestic cats
          RT wildcats
          BT animals
          SN used only for domestic cats
        domestic cats
          USE cats
        wildcats
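The thesaurus example on this slide can be turned into SKOS-style statements with a small conversion routine. This is a sketch under assumptions: the `ex:` prefix and the `concept_id` helper are hypothetical, triples are plain tuples, and the mapping follows the usual reading of the abbreviations (NT to skos:narrower, BT to skos:broader, RT to skos:related, UF to skos:altLabel, SN to skos:scopeNote, with USE entries folded into the preferred term).

```python
# Relations between concepts, keyed by the standard thesaurus abbreviations.
RELATION_TO_SKOS = {
    "NT": "skos:narrower",
    "BT": "skos:broader",
    "RT": "skos:related",
}

# The slide's example, as term -> list of (abbreviation, value).
THESAURUS = {
    "animals": [("NT", "cats")],
    "cats": [
        ("UF", "domestic cats"),
        ("RT", "wildcats"),
        ("BT", "animals"),
        ("SN", "used only for domestic cats"),
    ],
    "domestic cats": [("USE", "cats")],
    "wildcats": [],
}


def concept_id(term):
    """Hypothetical identifier scheme: 'ex:' plus the term with underscores."""
    return "ex:" + term.replace(" ", "_")


def to_skos(thesaurus):
    """Convert thesaurus entries into a set of (subject, predicate, object) tuples."""
    # Terms with a USE pointer are non-preferred: they do not become concepts.
    non_preferred = {
        term for term, rels in thesaurus.items()
        if any(code == "USE" for code, _ in rels)
    }
    triples = set()
    for term, rels in thesaurus.items():
        if term in non_preferred:
            continue
        subject = concept_id(term)
        triples.add((subject, "rdf:type", "skos:Concept"))
        triples.add((subject, "skos:prefLabel", term))
        for code, value in rels:
            if code == "UF":
                triples.add((subject, "skos:altLabel", value))    # literal
            elif code == "SN":
                triples.add((subject, "skos:scopeNote", value))   # literal
            else:
                triples.add((subject, RELATION_TO_SKOS[code], concept_id(value)))
    return triples


skos_triples = to_skos(THESAURUS)
for triple in sorted(skos_triples):
    print(triple)
```

The non-preferred term "domestic cats" does not get its own concept; it survives only as an alternative label of "cats", which is how SKOS usually models USE/UF pairs.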
  25. 26. SKOS, KOSs and LD
      • SKOS allows bridging across KOSs from different contexts
      http://www.w3.org/2004/02/skos/
  26. 27. SKOS is used!
      • Many libraries (not a surprise!)
          • Swedish National Library’s Libris catalogue and thesaurus http://libris.kb.se/
          • Library of Congress’ vocabularies, including LCSH http://id.loc.gov/
          • DNB’s Gemeinsame Normdatei (incl. SWD subject headings) http://d-nb.info/gnd/
              • Documentation at https://wiki.d-nb.de/display/LDS
          • BnF’s RAMEAU subject headings http://stitch.cs.vu.nl/
          • OCLC’s DDC classification http://dewey.info/ and VIAF http://viaf.org/
          • STW economy thesaurus http://zbw.eu/stw
          • National Library of Hungary’s catalogue and thesauri http://oszkdk.oszk.hu/resource/DRJ/404 (example)
      • Other fields
          • Wikipedia categories through DBpedia http://dbpedia.org/
          • New York Times subject headings http://data.nytimes.com/
          • IVOA astronomy vocabularies http://www.ivoa.net/Documents/latest/Vocabularies.html
          • GEMET environmental thesaurus http://eionet.europa.eu/gemet
          • UMTHES
          • Agrovoc http://aims.fao.org/
          • Linked Life Data http://linkedlifedata.com/
          • Taxonconcept http://www.taxonconcept.org/
          • UK Public sector vocabularies http://standards.esd.org.uk/ (e.g., http://id.esd.org.uk/lifeEvent/7 )
  27. 28. KOS Alignments?
      • Quite a few of them are linked to some other resource
      • LCSH, SWD and RAMEAU interlinked through MACS mappings
      • GND linked to DBpedia and VIAF
      • Libris linked to LCSH
      • Agrovoc to CAT, NAL, SWD, GEMET
      • NYT to Freebase, DBpedia, GeoNames
      • DBpedia links are overwhelming
          • Hungary, STW, TaxonConcept, GND…
  28. 29. Enabling bits & pieces (cont’d)
      • Appropriate data model for objects
          • Generic constructs for creation, title, subject, etc. that are useful for querying
      • Flexible data model
          • Semantic Web ontology linking features make it possible to stay close to the original data while keeping the generic notions above
  29. 30. Formal semantics, metadata schemas and querying
      • The query: ?x vra:subject ?y . ?y skos:broader rma:Egypt
      • The existing description: rma:gezicht_in_cairo rma:depicts rma:Cairo . rma:Cairo skos:broader rma:Egypt
      • Why is there a match?
      • For the Europeana ontology, every rma:depicts statement implies a vra:subject statement
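Why the query matches can be shown with a few lines of inference. The sketch below assumes the implication "every rma:depicts statement implies a vra:subject statement" is encoded as an rdfs:subPropertyOf axiom, which is one common way to state it; triples are plain tuples and the functions are illustrative, not part of any Europeana code.

```python
# The existing description plus the ontology axiom (the axiom's encoding is an assumption).
DATA = {
    ("rma:gezicht_in_cairo", "rma:depicts", "rma:Cairo"),
    ("rma:Cairo", "skos:broader", "rma:Egypt"),
    ("rma:depicts", "rdfs:subPropertyOf", "vra:subject"),
}


def infer_subproperties(triples):
    """Add (s, q, o) whenever (s, p, o) holds and p rdfs:subPropertyOf q, until nothing changes."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        axioms = [(s, o) for s, p, o in closed if p == "rdfs:subPropertyOf"]
        for sub_property, super_property in axioms:
            for s, p, o in list(closed):
                if p == sub_property and (s, super_property, o) not in closed:
                    closed.add((s, super_property, o))
                    changed = True
    return closed


def run_query(triples):
    """Evaluate: ?x vra:subject ?y . ?y skos:broader rma:Egypt"""
    return sorted(
        (x, y)
        for x, p, y in triples
        if p == "vra:subject" and (y, "skos:broader", "rma:Egypt") in triples
    )


before = run_query(DATA)
after = run_query(infer_subproperties(DATA))
print(before)
print(after)
```

Without the axiom the raw description never mentions vra:subject, so the query finds nothing; after inference the rma:depicts statement also counts as a vra:subject statement and the object is returned.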
  30. 31. Where are the challenges?
      • Semantic conversion of data
          • Using appropriate data models
          • Enriching legacy metadata
      • Semantic alignments
          • Between description ontologies
              • vra:depicts rdfs:subPropertyOf dc:subject
          • Between concepts in controlled vocabularies
              • iconclass:bird skos:closeMatch ddc:bird
  31. 32. Alignment of semantic references
  32. 33. Where are the challenges?
      • Semantic alignment (cont'd)
          • Find correspondences between large vocabularies
          • In a multilingual context
      • Scalability
          • Plugging the semantic features into the Europeana production environment
  33. 34. The Europeana Data Model (EDM)
      With input from Carlo Meghini, Guus Schreiber, Stefan Gradmann, Makx Dekkers, Steffen Hennicke, Viktor de Boer et al. from Europeana V1
  34. 35. Rationale of EDM
      • Precursor: ESE (Europeana Semantic Elements)
          • Represents the lowest common denominator for object metadata
              • Convert datasets to a Dublin Core-like standard
          • Forces interoperability
          • Major drawback: original metadata is lost
          • Most values are simple strings
      • EDM goals
          • Preserve original data while still allowing for interoperability
          • Semantic Web representation
      • A community-driven effort
          • Core experts, validation by representatives of various CH domains
  35. 36. EDM requirements & principles
      • Distinction between “provided object” (painting, book, program) and digital representation
      • Distinction between object and metadata record describing an object
      • Allow for multiple records for the same object, containing potentially contradictory statements about an object
      • Support for objects that are composed of other objects
      • Standard metadata format that can be specialized
      • Standard vocabulary format that can be specialized
      • EDM should be based on existing standards
  36. 37. EDM basics
      • OAI ORE for organization of metadata about an object
      • Dublin Core for metadata representation
      • SKOS for vocabulary representation
      • + Links to CIDOC-CRM and other shared ontologies
  37. 38. Dublin Core
      • EDM uses the latest version of DCMI Metadata Terms for a core of semantically interoperable properties
          • And for backward compatibility, cf. ESE
      • Specified with an RDF model
      • Specialization of the 15 original DC elements
      • Can be specialized itself
          • See the requirements: this is a crucial distinction from ESE
      • Used in the richest way possible
          • Pointers to resources
  38. 39. SKOS: vocabulary publication on the Web
      • Already seen…
  39. 40. OAI ORE
      • Specification:
      • http://www.openarchives.org/ore/1.0/toc.html
      • Specified with an RDF model
      • Four key notions (RDF classes)
          • Object: the book/painting/program being described
          • Aggregation: organizes object information from a particular provider (museum, archive, library)
          • Proxy: the object as viewed in a metadata record
          • Digital representation: some digital form of the object with a Web address
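The four notions can be pictured as a small data structure in which each provider contributes its own aggregation and proxy for the same object. This is only an illustrative sketch: the class and field names, the URIs and the titles are invented here and are not the ORE or EDM property names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proxy:
    """The object as viewed in one provider's metadata record."""
    provider: str
    proxy_for: str                                   # URI of the provided object
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Aggregation:
    """Organizes one provider's information about an object."""
    provider: str
    aggregated_object: str                           # URI of the provided object
    proxy: Proxy
    digital_representations: List[str] = field(default_factory=list)


# Hypothetical object URI shared by both providers.
PAINTING = "http://example.org/object/painting-1"

louvre = Aggregation(
    provider="Louvre",
    aggregated_object=PAINTING,
    proxy=Proxy("Louvre", PAINTING, {"dc:title": "Title as given by the Louvre"}),
    digital_representations=["http://example.org/louvre/painting-1.jpg"],
)
dmf = Aggregation(
    provider="DMF",
    aggregated_object=PAINTING,
    proxy=Proxy("DMF", PAINTING, {"dc:title": "Title as given by DMF"}),
    digital_representations=["http://example.org/dmf/painting-1.jpg"],
)


def titles_by_provider(aggregations, object_uri):
    """Collect each provider's own title for the same provided object."""
    return {
        a.provider: a.proxy.metadata.get("dc:title")
        for a in aggregations
        if a.aggregated_object == object_uri
    }


titles = titles_by_provider([louvre, dmf], PAINTING)
print(titles)
```

Because statements hang off the proxies rather than the object itself, two providers can give different (even contradictory) titles for the same painting without overwriting each other.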
  40. 41. The Example - 1
  41. 42. The Example - 2
  42. 43. Aggregation organizes data of a provider
      [Figure labels: aggregation; digital representation; object; provenance metadata]
  43. 44. Proxy: metadata record for an object
      [Figure labels: proxy; object metadata]
  44. 45. Multiple aggregations = multiple providers
      [Figure labels: aggregation of DMF; aggregation of Louvre]
  45. 46. Multiple aggregations = multiple providers
      [Figure labels: DMF proxy; Louvre proxy; Louvre title; DMF title; The “real” painting]
  46. 47. Europeana is “just” a special provider with processed/enriched metadata
      [Figure labels: Europeana aggregation; enriched metadata; Europeana landing page]
  47. 48. A flexible model: different semantic grains
      • Cf. goal: “preserve original data while still allowing for interoperability”
      • Keep data expressed as close as possible to the original model
      • Using mappings to a more interoperable level
  48. 49. A flexible model: objects, events and the rest
      • Preserving and exploiting original data also means being compatible with descriptions beyond simple object level
      • Also crucial for semantic enrichment
  49. 50. A flexible model: object and events (2)
      • Classes and properties for event-, agent-, place-centric modeling
      • Instances of (local) vocabularies using skos:Concept
      • Using RDF, EDM allows any kind of network to be attached to a provided object.
  50. 51. A flexible model: object and events (3)
  51. 52. Advanced modeling in EDM
      • Relations between provided objects
          • Part-whole links for complex (hierarchical) objects
          • Derivation and versioning relations
          • Relations between provided objects, for instance artistic derivation between works
              • ens:isRepresentationOf
              • ens:isNextInSequence
  52. 53. Linked data and cultural heritage?
  53. 54. The case for linked data in cultural heritage
      • Not just a more sophisticated way to represent data!
      • Ease of getting data from external sources
          • Just go to the URI and fetch the RDF there
      • Ease of publishing data
          • Linked data as a dissemination channel for Europeana data
      • Ease of linking across datasets
          • Linked data as a dissemination channel for Europeana data
      • Object identification as cornerstone
          • Records are just a side feature!
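"Going to the URI and fetching the RDF" usually means an HTTP request that asks for an RDF media type through content negotiation. The sketch below uses only the Python standard library; the example URI is a placeholder, and whether a given server actually answers with RDF/XML (or in UTF-8) is an assumption that depends on the publisher.

```python
import urllib.request


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Prepare a GET request that asks the server for an RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch_rdf(uri, media_type="application/rdf+xml", timeout=10):
    """Dereference the URI and return the response body as text.

    Assumes the server honours the Accept header and sends UTF-8 text.
    """
    request = build_rdf_request(uri, media_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; only fetch_rdf() does.
request = build_rdf_request("http://example.org/id/concept/egypt")
print(request.full_url)
print(request.get_header("Accept"))
```

Calling `fetch_rdf` on the URI of a concept or object published as linked data would return its RDF description, which can then be parsed with the RDF toolkit of choice.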
  54. 55. Encouraging open linked data adoption
      • From a movement supported by researchers
      • To much wider awareness
          • Open government initiatives, libraries…
      • Continuing effort: show the benefits of contributing to a cultural heritage data web
      • Library Linked Data W3C incubator
          • http://www.w3.org/2005/Incubator/lld
  55. 56. Linked Library Cloud beginning 2008 [Ross Singer, Code4Lib2010] http://code4lib.org/conference/2010/singer
  56. 57. Linked Library Cloud mid-2010
      • Plus:
          • Germany NL
          • Hungary NL
          • STW
          • GEMET
          • NYT
          • Agrovoc
      [Ross Singer, Code4Lib2010] http://code4lib.org/conference/2010/singer
  57. 58. Is that a surprise?
      • Not really, let’s have a look at a real-world case…
  58. 59. Johan Stapel, Koninklijke Bibliotheek KOS & collection environment @KB
  59. 60.
      • A broad range of datasets
      • That describe the same objects
      • Or related objects
      • Which are about similar subjects
      • Which were made by the same persons
      • Or related persons
      • In the same places
      • Etc…
  60. 61. Thanks!
      • [email_address]
      • Europeana.eu team
      • Web and Media lab @ Vrije Universiteit Amsterdam
          • http://wiki.cs.vu.nl/web-media
      • EuropeanaConnect project
          • http://www.europeanaconnect.eu/