Europeana and open data


Published on

The Europeana Data Model

Published in: Technology


  1. 1. Europeana and Open Data. Robina Clayphan, Interoperability Manager, Europeana. LDBC TUC meeting, 19 November 2013
  2. 2. What is Europeana? • Europeana is a service that brings together digital content from across the cultural heritage domain in Europe • It makes the metadata freely available • It is a catalyst for change in the world of cultural heritage. • Our vision: We believe in making cultural heritage openly accessible in a digital way, to promote the exchange of ideas and information.
  3. 3. Europe’s cultural heritage portal. Museums, National Aggregators, Regional Aggregators, Archives, Thematic collections, Libraries - A network of participants in development and innovation - Nearly 30 million objects from 2,400 European galleries, museums, archives and libraries
  4. 4. What types of objects does Europeana give access to? Text Image Video Sound 3D
  5. 5. Europeana and open data
  6. 6. What Europeana makes available Metadata Link to digital objects online
  7. 7. Metadata (descriptive object information) Different options: Open – not fully open (but clear) – Not open Two categories of rights CC
  8. 8. The Europeana Data Model
  9. 9. EDM requirements & principles 1. Distinction between “provided objects” (painting, book, movie, etc.) and their digital representations 2. Distinction between objects and metadata records describing an object 3. Allow for multiple records for the same object, containing potentially contradictory statements about it 4. Support for objects that are composed of other objects 5. Support for contextual resources, including concepts from controlled vocabularies. Richer metadata with finer granularity
  10. 10. Provide more semantics to the data Build a semantic layer on top of Cultural Heritage objects
  11. 11. EDM Classes
  12. 12. The three core classes: ore:Aggregation (identifier of aggregation), edm:ProvidedCHO (identifier of real object), edm:WebResource (identifier of web resource). An aggregation with a provided CHO and a web resource, linked by edm:aggregatedCHO and edm:hasView
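The links between the three core classes can be sketched in Python as plain (subject, predicate, object) triples. This is only an illustration, not Europeana's implementation: the `example.org` identifiers and the helper `core_triples` are hypothetical, and only the class and property names come from the slide.

```python
# Minimal sketch: the three EDM core classes as plain RDF-style triples.
# All example.org identifiers are hypothetical placeholders.

def core_triples(aggregation, cho, web_resource):
    """Return triples linking one aggregation to its CHO and one web resource."""
    return [
        (aggregation, "rdf:type", "ore:Aggregation"),
        (cho, "rdf:type", "edm:ProvidedCHO"),
        (web_resource, "rdf:type", "edm:WebResource"),
        # The aggregation points at the real-world object it is about ...
        (aggregation, "edm:aggregatedCHO", cho),
        # ... and at a digital representation of that object.
        (aggregation, "edm:hasView", web_resource),
    ]

triples = core_triples(
    "http://example.org/aggregation/1",
    "http://example.org/item/1",
    "http://example.org/images/1.jpg",
)
for s, p, o in triples:
    print(s, p, o)
```

Keeping the aggregation as the hub means the object and its digital representations never need to reference each other directly.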
  13. 13. The Aggregation with metadata
  14. 14. Properties for the Aggregation. The aggregation represents the set of related resources about one real object contributed by one provider. It carries the metadata that is about the whole set. Mandatory: edm:aggregatedCHO, edm:dataProvider, edm:isShownBy or edm:isShownAt, edm:provider, edm:rights. Optional: edm:hasView, edm:object, dc:rights, edm:ugc
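A check for the mandatory aggregation properties might look like the sketch below. It assumes a record is held as a plain Python dict keyed by property name, which is a choice made for this example only; the function name `missing_mandatory` and the sample values are hypothetical.

```python
# Sketch of a mandatory-property check for an ore:Aggregation record.
# The dict layout (property name -> value) is an assumption for illustration.

REQUIRED = ["edm:aggregatedCHO", "edm:dataProvider", "edm:provider", "edm:rights"]
ONE_OF = ["edm:isShownBy", "edm:isShownAt"]  # at least one must be present

def missing_mandatory(record):
    """Return the mandatory properties that are absent or empty in the record."""
    missing = [prop for prop in REQUIRED if not record.get(prop)]
    if not any(record.get(prop) for prop in ONE_OF):
        missing.append(" or ".join(ONE_OF))
    return missing

record = {
    "edm:aggregatedCHO": "http://example.org/item/1",
    "edm:dataProvider": "Example Museum",
    "edm:provider": "Example Aggregator",
    "edm:isShownAt": "http://example.org/landing/1",
}
print(missing_mandatory(record))  # ['edm:rights']
```

The "one of two" rule is kept separate from the simple required list because it cannot be expressed as a per-property presence test.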
  15. 15. Properties for the ProvidedCHO. The ProvidedCHO is the cultural heritage object which is the subject of the package of data that has been submitted to Europeana. Properties: dc:contributor, dc:coverage, dc:creator, dc:date, dc:description, dc:format, dc:identifier, dc:language, dc:publisher, dc:relation, dc:rights, dc:source, dc:subject, dc:title, dc:type, dcterms:alternative, dcterms:extent, dcterms:temporal, dcterms:medium, dcterms:created, dcterms:provenance, dcterms:issued, dcterms:conformsTo, dcterms:hasFormat, dcterms:isFormatOf, dcterms:hasVersion, dcterms:isVersionOf, dcterms:hasPart, dcterms:isPartOf, dcterms:isReferencedBy, dcterms:references, dcterms:isReplacedBy, dcterms:replaces, dcterms:isRequiredBy, dcterms:requires, dcterms:tableOfContents, edm:isNextInSequence, edm:isDerivativeOf, edm:currentLocation…
  16. 16. Properties for the web resource. One or more digital representations of the provided cultural heritage object. Properties: dc:description, dc:format, dc:rights, dc:source, dcterms:conformsTo, dcterms:created, dcterms:extent, dcterms:hasPart, dcterms:isFormatOf, dcterms:isPartOf, dcterms:issued, edm:isNextInSequence, edm:rights
  17. 17. EDM Classes
  18. 18. Contextual classes. Representing (real-world) entities related to a provided object as fully fledged resources, not just strings. edm:Agent: foaf:name, skos:altLabel, rdaGr2:biographicalInformation, rdaGr2:dateOfBirth…. skos:Concept: skos:prefLabel, skos:altLabel, skos:broader, skos:definition…. edm:TimeSpan: skos:prefLabel, dcterms:isPartOf, edm:begin, edm:end…. edm:Place: wgs84_pos:lat, wgs84_pos:long, skos:prefLabel, dcterms:isPartOf….
  19. 19. Example of a CHO with two contextual classes: edm:ProvidedCHO [identifier for "real" object] linked by dc:creator to edm:Agent [identifier for person resource] ("Darwin, Charles", "12-02-1809", "12-04-1882") and by dc:subject to skos:Concept [identifier for subject resource] ("Evolution"@en, "Évolution"@fr)
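The Darwin example can be written out as triples in a Turtle-like notation. The sketch below is illustrative only: the `example.org` identifiers and the `literal` helper are hypothetical, and the choice of foaf:name for the person's name and rdaGr2:dateOfBirth for the first date is an assumption based on the properties listed for edm:Agent on the previous slide. The second date is left out because the slides do not name its property.

```python
# Sketch: a provided CHO linked to an agent and a concept as resources, not strings.
# Identifiers under example.org are hypothetical placeholders.

def literal(text, lang=None):
    """Format a plain or language-tagged literal in Turtle-like syntax."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'

CHO = "<http://example.org/item/1>"
AGENT = "<http://example.org/agent/darwin>"
CONCEPT = "<http://example.org/concept/evolution>"

triples = [
    (CHO, "a", "edm:ProvidedCHO"),
    (CHO, "dc:creator", AGENT),      # link to a resource, not the string "Darwin"
    (CHO, "dc:subject", CONCEPT),
    (AGENT, "a", "edm:Agent"),
    (AGENT, "foaf:name", literal("Darwin, Charles")),         # property assumed
    (AGENT, "rdaGr2:dateOfBirth", literal("12-02-1809")),     # property assumed
    (CONCEPT, "a", "skos:Concept"),
    (CONCEPT, "skos:prefLabel", literal("Evolution", "en")),
    (CONCEPT, "skos:prefLabel", literal("Évolution", "fr")),
]

for s, p, o in triples:
    print(f"{s} {p} {o} .")
```

Because the concept is a resource with labels in several languages, a search for "Évolution" and a search for "Evolution" can both reach the same object.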
  20. 20. Accessing and re-using Europeana data
  21. 21. How do users access Europeana content? Europeana aims to provide content in the users’ workflow – where they want it, when they want it. User focused channels: portal, social media exports For programmers: API, search widget, semantic mark up, LOD pilot
  22. 22. Europeana’s infrastructure is open for re-use. Europeana data available via: • API • Search widgets • Semantic mark-up on portal • Linked Open Data pilot
  23. 23. Some (approximate) numbers. Europeana database: 30 million objects. LOD pilot: a subset of 20 million objects • contained nearly 1 billion explicit RDF statements • 4 billion once all the RDF reasoning (sub-properties, sub-classes, etc.) is done in OWLIM • Ontotext has already loaded a chunk of the data and is working on updating it, in Europeana Creative.
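The growth from explicit to inferred statements can be illustrated with a toy sub-property closure. This is a sketch under stated assumptions, not how OWLIM works internally: the two sub-property relations in the table reflect my reading of the EDM and Dublin Core definitions, and the `agg:` and `cho:` identifiers and sample values are invented.

```python
# Sketch: how sub-property reasoning adds inferred statements to explicit ones.
# The table is a small illustrative excerpt (assumed), not the full ontology.

SUBPROPERTY_OF = {
    "edm:isShownBy": "edm:hasView",
    "dcterms:created": "dc:date",
}

def with_inferred(explicit):
    """Return the explicit triples plus those entailed by rdfs:subPropertyOf."""
    closure = set(explicit)
    frontier = list(explicit)
    while frontier:
        s, p, o = frontier.pop()
        parent = SUBPROPERTY_OF.get(p)
        if parent and (s, parent, o) not in closure:
            closure.add((s, parent, o))
            frontier.append((s, parent, o))  # the parent may itself have a parent
    return closure

explicit = {
    ("agg:1", "edm:isShownBy", "http://example.org/images/1.jpg"),
    ("cho:1", "dcterms:created", "1859"),
    ("agg:1", "edm:provider", "Example Aggregator"),
}
all_triples = with_inferred(explicit)
print(len(explicit), "explicit;", len(all_triples), "after reasoning")
```

Here three explicit statements become five. Sub-class reasoning multiplies type statements in the same way.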
  24. 24. Possible benchmarking queries? Queries for exploring the dataset • e.g. to generate the complete ordered list of Europeana aggregators and the data providers they gather. Queries for exploring the objects • e.g. a list of works with a matching location/creator/title • Simple graph traversal. Expressing EDM constraints that cannot be expressed in OWL • Can RDF validation help, e.g. where at least one of two properties must be present (title or description)? Queries to assist in data quality improvement • Broken links, duplicates (or near duplicates), missing mandatory properties, missing thumbnails, etc. For information: we are starting a data quality task force if you are interested!
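Two of these query ideas can be sketched over an in-memory list of records. The example assumes that edm:provider names the aggregator and edm:dataProvider the institution it gathers data from. The sample records and both function names are invented for illustration, and a real benchmark would run equivalent queries against the triple store instead.

```python
# Sketch of two of the query ideas over an in-memory list of records.
# Field names follow the slides; the sample records are invented.
from collections import defaultdict

records = [
    {"id": "item/1", "edm:provider": "Aggregator B", "edm:dataProvider": "Museum X",
     "dc:title": "A title"},
    {"id": "item/2", "edm:provider": "Aggregator A", "edm:dataProvider": "Library Y",
     "dc:description": "A description"},
    {"id": "item/3", "edm:provider": "Aggregator A", "edm:dataProvider": "Archive Z"},
]

def providers_by_aggregator(records):
    """Ordered list of (aggregator, sorted data providers it gathers)."""
    grouped = defaultdict(set)
    for rec in records:
        grouped[rec["edm:provider"]].add(rec["edm:dataProvider"])
    return [(agg, sorted(grouped[agg])) for agg in sorted(grouped)]

def lacks_title_and_description(records):
    """Ids of records violating 'title or description must be present'."""
    return [r["id"] for r in records
            if not (r.get("dc:title") or r.get("dc:description"))]

print(providers_by_aggregator(records))
# [('Aggregator A', ['Archive Z', 'Library Y']), ('Aggregator B', ['Museum X'])]
print(lacks_title_and_description(records))
# ['item/3']
```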
  25. 25. Useful links: Europeana portal; Europeana Professional (EDM documentation, Europeana API, LOD pilot); Data Quality task force; Europeana Professional blog; Facebook; Twitter; Europeana Thought Lab; Europeana end-user blog
  26. 26. Thank you Robina Clayphan
  27. 27. Bonus slides!
  28. 28. EDM design requirements • Compatibility with different levels of description - Allow different levels of granularity - A book, a page, a detail of an image • Standard metadata format that can be specialized - Allow the specification of domain-specific application profiles - Enable the re-use of existing standards - Allow the extension of the initial model
  29. 29. EDM basis • OAI ORE (Open Archives Initiative Object Reuse & Exchange) for organizing an object’s metadata and digital representation(s) • Dublin Core for descriptive metadata • SKOS (Simple Knowledge Organization System) for conceptual vocabulary representation • CIDOC-CRM for the modeling of events and relationships between objects • Use the Semantic Web representation principles - RDF - Re-use and mix different vocabularies together - Preserve original data and still allow for interoperability
  30. 30. EDM Properties (excluding ESE)
  31. 31. Two providers and two aggregations (the same object): aggregation of DMF and aggregation of Louvre, each with its own provenance metadata, both about the same cultural heritage object
  32. 32. Europeana aggregation Enriched metadata Landing page