Valentine Charles: Linking cultural heritage with KOS: the Europeana example

Valentine Charles (Europeana) “Linking cultural heritage with KOS: the Europeana example”
Presentation at the KnoweScape workshop "Evolution and variation of classification systems" March 4-5, 2015 Amsterdam

  1. 1. Linking cultural heritage with KOS: the Europeana example. Valentine Charles. Evolution and variation of classification systems – KnoweScape, Amsterdam, 05.03.2015
  2. 2. Context → Aggregates metadata from the cultural heritage sector in Europe • Libraries, museums, archives and audio-visual archives • Metadata in 33 languages → Provides a portal for users to access data and objects • http://www.europeana.eu/ in 31 languages • Metadata under Creative Commons Zero - public domain • Previews and links to source → Data distributed via • API http://labs.europeana.eu/api/ • Linked Data (currently being updated) http://data.europeana.eu/
  3. 3. Europeana.eu, Europe’s cultural heritage portal: 40M objects from 2,200 galleries, museums, archives and libraries
  4. 4. Create a new data framework for richer metadata → Europeana Data Model (EDM) • Re-uses several existing Semantic Web-based models: Dublin Core, OAI-ORE, SKOS, CIDOC-CRM… • More granular metadata • Links, e.g. between objects and context entities (persons, places) • Multilingual and semantic linked data for contextual resources (e.g. concepts) → EDM gives support for contextual resources (semantic layer)
  5. 5. Rely on KOS to solve a problem of data integration → Create a “semantic layer” on top of connected cultural heritage objects • Include multilingual “value vocabularies” • From Europeana’s providers or from third-party data sources
  6. 6. Contextual entities: representing (real-world) entities related to a provided object as fully fledged resources, not just strings • edm:Agent: foaf:name, skos:altLabel, rdaGr2:biographicalInformation, rdaGr2:dateOfBirth • skos:Concept: skos:prefLabel, skos:altLabel, skos:broader, skos:related, skos:definition, … • edm:TimeSpan: skos:prefLabel, dcterms:isPartOf, edm:begin, edm:end, … • edm:Place: wgs84_pos:lat, wgs84_pos:long, skos:prefLabel, skos:note, dcterms:isPartOf, …
  7. 7. Encourage data providers to contribute their own vocabularies → Benefit from data links made at data providers’ level → Ingestion of vocabularies is possible if the vocabularies use the data structures EDM expects • For instance, SKOS for concepts → For other vocabularies, Europeana does custom mappings
  8. 8. An example: the integration of AAT URIs in EDM. [Diagram: an edm:ProvidedCHO “Hourglass” (urn:imss:instrument:401058) linked via dc:type to a skos:Concept, which carries skos:prefLabel values hourglasses@en, uurglazen@nl, reloj de las horas@es and a skos:broader link. AAT URIs shown: http://vocab.getty.edu/aat/300206197 and http://vocab.getty.edu/aat/300198626]
  9. 9. Demo with AAT and PartagePlus vocabularies → http://www.europeana.eu/portal/search.html?query=sabliers&rows=24&qf=PROVIDER%3A%22Museo+Galileo+-+Istituto+e+Museo+di+Storia+della+Scienza%22&qt=false → http://www.europeana.eu/portal/search.html?query=Brooch&rows=24&qf=PROVIDER%3A%22Partage+Plus%22&qt=false
  10. 10. Vocabularies currently supported by Europeana
  11. 11. Challenge #1 → Europeana needs to regularly check that vocabularies have not changed at source: • Changes in concepts’ identifiers • Changes in the description of concepts (which would require a new mapping)
  12. 12. Challenge #2 → Some of the vocabularies supported by Europeana have been developed by projects • Issue of sustainability: who maintains the vocabulary when the project ends? What happens to the data?
  13. 13. Europeana also manages its own vocabulary: WWI example → Europeana developed a series of domain-specific “sub-sites” → Europeana 1914-1918 (http://www.europeana1914-1918.eu/) developed its own vocabulary based on a subset of LCSH • Terms translated into 10 languages and linked to id.loc.gov • Published in SKOS via the OpenSkos vocabulary service
  14. 14. http://data.europeana.eu/concept/loc/sh85148236
  15. 15. Challenge #3 → Creation of caches of existing LOD vocabularies • Europeana needs to keep track of the updates at the vocabulary provider side. → The enrichment done on the Europeana side lives separately from the source vocabulary.
  16. 16. Multilingual Access to Subjects (MACS) → The MACS project has produced manual and semi-automatic alignments between: • Library of Congress Subject Headings (LCSH) • RAMEAU • Schlagwortnormdatei (SWD) ⇒ 120,000 links created → MACS is integrated in The European Library as links included in all bibliographic data.
  17. 17. An example of a MACS record before and after additions by The European Library: • ARK identifiers • LOD URIs
  18. 18. Enrichments added through MACS: the subject-enriched record in EDM for delivery to Europeana
  19. 19. Automatic enrichment based on KOS. Goal: contextualization which goes beyond the scope of a particular platform. [Diagram: Object; External Dataset and Vocabulary]
  20. 20. Automatic enrichment process in Europeana: (1) Analysis • Metadata fields in resource descriptions • Selection of potential rules to match (2) Linking • Matching the values of the metadata fields to values of the contextual resources • Adding contextual links (3) Augmentation • Selecting the values from the contextual resource • Augmentation of the index with the labels picked from the vocabulary
  21. 21. Vocabulary selection requirements. In the context of Europeana a target vocabulary should be: → Technically available (through Linked Data or in dedicated repositories), properly documented, and in open access; → Well connected to other vocabularies, e.g. equivalent elements in other vocabularies are indicated; • Key to avoid duplication and redundancy → Multilingual
  22. 22. Enrichment Types and Vocabularies (enrichment type: target vocabulary; source metadata fields) • Places: GeoNames; dcterms:spatial, dc:coverage • Concepts: GEMET, DBpedia; dc:subject, dc:type • Agents: DBpedia; dc:creator, dc:contributor • Time: Semium Time; dc:date, dc:coverage, dcterms:temporal, edm:year
  23. 23. Europeana enrichment: an example
  24. 24. Challenge #4 → A significant change in the target vocabulary implies • An update of the retrieved RDF files and a new deployment of the enrichment framework (and/or) • An update of the enrichment rules
  25. 25. Challenge #5 → Europeana data providers might also perform enrichment on their side → Europeana currently has no mechanism to separate the (curated) links to contextual resources by data providers from (automatic) enrichments by providers.
  26. 26. Challenge #6 → Automatic enrichment has flaws and problems • For instance, linking any print to the physical “pressure” concept because of its German “Druck” alternative label. → Incorrect enrichments lead to • Devaluation of curated metadata • Loss of trust from providers • Irrelevant search results • Bad user experiences
  27. 27. To conclude → Europeana continues to focus on pivot vocabularies such as Wikidata and Agrovoc to improve its search and retrieval services. → We are now investigating how to use more domain-specific vocabularies for dedicated services. → We also work on the definition of best practices and evaluation methods for enrichment • http://pro.europeana.eu/get-involved/europeana-tech/europeanatech-task-forces/evaluation-and-enrichments
  28. 28. Thank you Valentine Charles valentine.charles@europeana.eu
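
The three enrichment phases on slide 20 (analysis, linking, augmentation) and the pitfall on slide 26 can be illustrated with a small sketch. This is not Europeana's enrichment framework: it is a minimal illustration assuming naive, case-insensitive exact matching on labels. The field lists come from slide 22; the class name `ContextualResource`, the function names, the `example.org` URIs, the sample record and the labels attached to each concept are all hypothetical.

```python
from dataclasses import dataclass

# Source metadata fields per enrichment type, as listed on slide 22.
SOURCE_FIELDS = {
    "Places": ["dcterms:spatial", "dc:coverage"],
    "Concepts": ["dc:subject", "dc:type"],
    "Agents": ["dc:creator", "dc:contributor"],
    "Time": ["dc:date", "dc:coverage", "dcterms:temporal", "edm:year"],
}


@dataclass(frozen=True)
class ContextualResource:
    """Hypothetical stand-in for a vocabulary entry (e.g. a skos:Concept)."""
    uri: str
    kind: str        # one of the keys of SOURCE_FIELDS
    labels: tuple    # (language, label) pairs, preferred and alternative alike


def analyse(record, kind):
    """Analysis: collect the values of the fields relevant to one enrichment type."""
    return [v for name in SOURCE_FIELDS[kind] for v in record.get(name, [])]


def link(values, vocabulary, kind):
    """Linking: naive case-insensitive exact match of field values against labels."""
    wanted = {v.casefold() for v in values}
    return [
        res.uri
        for res in vocabulary
        if res.kind == kind
        and any(label.casefold() in wanted for _, label in res.labels)
    ]


def augment(uris, vocabulary):
    """Augmentation: gather the labels of linked resources for the search index."""
    by_uri = {res.uri: res for res in vocabulary}
    return sorted({label for uri in uris for _, label in by_uri[uri].labels})


# Made-up vocabulary: two concepts that share the German label "Druck".
VOCABULARY = [
    ContextualResource("http://example.org/concept/prints", "Concepts",
                       (("en", "prints"), ("de", "Druck"))),
    ContextualResource("http://example.org/concept/pressure", "Concepts",
                       (("en", "pressure"), ("de", "Druck"))),
]

# Made-up record describing a print, with a German type value.
record = {"dc:title": ["Ansicht von Amsterdam"], "dc:type": ["Druck"]}

values = analyse(record, "Concepts")
links = link(values, VOCABULARY, "Concepts")
index_labels = augment(links, VOCABULARY)

print(links)
print(index_labels)
```

In this sketch the record is linked to both concepts, so the label "pressure" ends up in the index for a print. Matching on labels alone cannot tell homonyms apart, which is the kind of incorrect enrichment slide 26 describes.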
