Enriching Cultural Heritage Data with DBpedia

Presentation of automatic enrichment at Europeana with DBpedia, at the DBpedia community meeting 2016, http://wiki.dbpedia.org/meetings/TheHague2016

Published in: Technology

  1. Enriching Cultural Heritage Data with DBpedia. Antoine Isaac | DBpedia Community Meeting 2016. Image: Arrival of a Portuguese ship, Anonymous, 1660 - 1625, Rijksmuseum, Netherlands, Public Domain
  2. Europeana? Europeana Collections homepage (Europeana | CC BY-SA). Enriching Cultural Heritage Data with DBpedia, CC BY-SA
  3. Europeana? Europeana aggregation infrastructure (Europeana | CC BY-SA)
  4. Europeana has many data challenges
     We aggregate very heterogeneous metadata
       • More than 48M objects
       • 3,500 galleries, libraries, archives and museums
       • 50 languages
       • From all EU countries
       • Level of quality varies greatly
  5. Linked Open Data. Europeana Linked Open Data video on Vimeo (Europeana | CC BY-SA)
  6. Europeana Linked Data Strategy: our efforts and lines of work
       • The Europeana Data Model (EDM) offers a way to represent richer (linked) data
       • We apply an enrichment strategy to link source data to reference data, including DBpedia
     Will be discussed in Parallel Session 2:
       • We encourage data providers to contribute links between objects and (their own) vocabularies
       • We encourage alignment activities between domain vocabularies
  7. The Europeana Data Model. Europeana Data Model example (Europeana | CC BY-SA). Image: Clavecin, Bartolomeo Cristofori, Cité de la Musique, MIMO - Musical Instruments Museums Online | CC BY-NC-SA
  8. Create a “semantic layer” on top of cultural heritage objects. Include multilingual “value vocabularies” (e.g. thesauri represented in SKOS) from Europeana’s providers or from third-party data sources
  9. Semantic enrichment, a solution for better quality data?
     Automatic and manual enrichment are more and more commonly used in digital libraries to:
       • normalise data
       • “standardize data” by linking it to authority resources
       • improve multilingual coverage in datasets
       • contextualise resources
  10. The main components of semantic enrichment
       • Source: source objects whose metadata is being enriched
       • Target: set of resources used to enrich the source metadata; targets can be of different types, from simple uncontrolled strings to resources published as LOD
       • Rules: specify how the enrichment between the source and target should be executed
  11. Automatic enrichment process in Europeana
       • Analysis: selection of metadata fields in descriptions; selection of potential rules to match
       • Linking: matching the values of the metadata fields to values of the contextual resources; adding contextual links
       • Augmentation of search index: selection of values from the contextual resource; values go into the search index
  13. Vocabularies we currently enrich metadata with
      Entity class | Target vocabulary | Size    | Metadata fields subject to enrichment
      Places       | GeoNames          | 140,097 | dcterms:spatial, dc:coverage
      Concepts     | DBpedia           | 5,284   | dc:subject, dc:type
      Concepts     | GEMET             | 280     | dc:subject, dc:type
      Agents       | DBpedia           | 161,209 | dc:creator, dc:contributor
      Time         | Semium Time       | 2,566   | dc:coverage, dcterms:temporal, dc:date, edm:year
  14. Why DBpedia? Building an ecosystem of networked references
       • It offers labels in about 124 languages through all its language editions, of which 48 match the languages that Europeana supports
       • It gives fairly complete and accurate descriptive metadata about entities
       • It works well as a “pivot” vocabulary, providing further links to other vocabularies such as Wikidata and Freebase
  15. Not everything is perfect. Image: Colombes : championnats de France d’Athlétisme : rivière, le speaker, Agence de presse Meurisse, 1921, National Library of France, France, Public Domain
  16. Challenges of multilingual automatic enrichment
       • Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
       • Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy. Marlies Olensky, Juliane Stiller, Evelyn Dröge, MTSR 2012. http://link.springer.com/chapter/10.1007%2F978-3-642-35233-1_25
  17. Comparative evaluation of enrichments. We ran a quantitative evaluation on a sample set enriched by 7 different tools (settings). http://pro.europeana.eu/taskforce/evaluation-and-enrichments
  18. Examples of recommendations that will be explored
     Define your enrichment goals
       • Develop better criteria for evaluating enrichment
     Choose the right service
       • enrichment tool more aware of the semantics of the model
     Monitor your enrichment process and re-assess
       • target dataset could be richer: new terms, new languages, more granular
     Enrichment using a better reference for contextual entities? You will hear about this in the next session ☺
  19. With slides from Valentine Charles, Juliane Stiller, Hugo Manguinhas and Stefan Gradmann
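
As an illustration of slides 10, 11 and 13, the three components (source, target, rules) and the three process steps (analysis, linking, augmentation of the search index) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Europeana's actual implementation: the field-to-entity-class mapping is copied from the table on slide 13, but the matching rule (exact label match after case and whitespace normalisation), the `Entity` class, all function names, and the sample record and labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Rules: which entity classes each metadata field may be linked to
# (mapping taken from the table on slide 13).
FIELD_TO_CLASSES = {
    "dcterms:spatial": ["Places"],
    "dc:coverage": ["Places", "Time"],
    "dc:subject": ["Concepts"],
    "dc:type": ["Concepts"],
    "dc:creator": ["Agents"],
    "dc:contributor": ["Agents"],
    "dcterms:temporal": ["Time"],
    "dc:date": ["Time"],
    "edm:year": ["Time"],
}


@dataclass
class Entity:
    """Target: a contextual resource with multilingual labels (hypothetical shape)."""
    uri: str
    entity_class: str
    labels: dict  # language code -> label


def normalise(text):
    """Assumed matching rule: compare case-folded, whitespace-collapsed strings."""
    return " ".join(text.casefold().split())


def analyse(record):
    """Analysis: keep only (field, value) pairs whose field has a rule."""
    return [(f, v) for f, values in record.items()
            if f in FIELD_TO_CLASSES for v in values]


def link(record, entities):
    """Linking: match field values against labels of entities of an allowed class."""
    links = []
    for f, value in analyse(record):
        for e in entities:
            if e.entity_class not in FIELD_TO_CLASSES[f]:
                continue
            if normalise(value) in {normalise(l) for l in e.labels.values()}:
                links.append((f, value, e.uri))
    return links


def augment(links, entities, languages):
    """Augmentation: collect labels of linked entities, in the given languages, for the index."""
    by_uri = {e.uri: e for e in entities}
    labels = set()
    for _, _, uri in links:
        labels.update(l for lang, l in by_uri[uri].labels.items() if lang in languages)
    return sorted(labels)


# Illustrative sample data (labels are examples, not an extract from DBpedia).
ENTITIES = [
    Entity("http://dbpedia.org/resource/Bartolomeo_Cristofori", "Agents",
           {"en": "Bartolomeo Cristofori"}),
    Entity("http://dbpedia.org/resource/Harpsichord", "Concepts",
           {"en": "Harpsichord", "fr": "Clavecin", "de": "Cembalo"}),
]
RECORD = {
    "dc:creator": ["Bartolomeo Cristofori"],
    "dc:subject": ["harpsichord"],
    "dc:title": ["Clavecin"],  # no rule for dc:title, so it is never matched
}

LINKS = link(RECORD, ENTITIES)
INDEX_TERMS = augment(LINKS, ENTITIES, {"en", "fr"})
```

In this sketch the French label "Clavecin" reaches the search index only through the link made from `dc:subject`, which is the multilingual gain slide 9 refers to; the same string in `dc:title` is ignored because no rule selects that field. A real matcher would also need disambiguation, which is where the problems shown on slides 15 and 16 arise.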
