Europeana and open data

Usage Rights

© All Rights Reserved

  • Structured as a network of participants in the development and innovation work. At a working level, we operate in a network of aggregators. Aggregators are important because they share a background with the organisations whose content they bring together, so there is close understanding. The aggregation model enables Europeana to collect huge quantities of data from thousands of providers through only a handful of channels.
  • Les Miserables: Victor Hugo’s handwritten manuscripts. BnF, public domain
    Matisse ‘53 in the form of a double helix’. Wellcome Library (CC-BY-NC-ND)
    ‘söprűtánc’ – Hungarian traditional dance. Hungarian Academy of Sciences Institute for Musicology, public domain
    ‘Neurologico reggae’, music album. DISMARC – EuropeanaConnect, Paid Access
    ‘Castle of Kavala’, 3D exploration of a Greek castle. Cultural and Educational Technology Institute - Research Centre Athen, CARARE, CC-BY-NC-ND
  • Example used is: Een vrouw met een kind in een kelderkamer by Pieter de Hooch, Rijksmuseum, public domain
  • We had seven requirements (these are five). We started with a flat, DC-type metadata standard that was a common denominator for all the different practices and standards of the providers. Moving on, we wanted a more sophisticated model that allowed us to:
    - Accommodate different data models with different structures
    - Accommodate domain-specific requirements
    - Keep the semantics of the original data
  • The semantic layer provides more context to the object:
    - Links to related entities (people, places, etc.)
    - Allows the representation of other specific relationships: the “aboutness” of the object, similarities, and more general links such as the general part-whole relation, citation, and direct versioning links
  • Red -> for providers and Europeana. Green -> for Europeana, to allow for duplication and enrichment.
  • This diagram shows the three core classes and the relationships between them. The ProvidedCHO is the “real thing” as it exists in the real world, the Mona Lisa for example. The Web Resource is the digital representation of the ProvidedCHO and is the resource that is accessible from Europeana. The Aggregation is the construct that links these objects to make a logical whole. Each of these has a defined set of properties that can be attached to it.
  • Properties that relate to the aggregation: notably the data about where the data comes from, and the identifiers of the real thing and its digital representations.
  • Has the most descriptive properties (and is backward compatible with ESE): many dc properties and more EDM ones for describing the object and its relationships to other entities. Some elements are mandatory (the DQ task force is focusing on this):
    - dc:title or dc:description
    - dc:language for text objects
    - dc:subject or dc:type or dc:coverage or dcterms:spatial
    - edm:type
  • Red -> for providers and Europeana. Green -> for Europeana, to allow for duplication and enrichment.
  • The Poisonous Nature exhibition includes content from Europeana. The Europeana Fashion portal will go live in May 2013.
  • Data is available as ‘data dumps’ for Linked Open Data initiatives from
    Europeana's move to CC0 is a step change in open data access. Releasing data from across the memory organisations of every EU country sets an important new international precedent, a decisive move away from the world of closed and controlled data.
    Note that previews can only be used in accordance with the rights information displayed next to them.
    HISPANA and Partage Plus both use the Europeana API to include Europeana search results on their own websites.
  • These are the final two requirements.
  • And here we have both examples: two providers of the same object but with different metadata. So there are two aggregations about the same object, and the concept of a proxy is introduced to keep the different sets of metadata apart. Proxies are not something that providers necessarily need to worry about, but they are something we in Europeana need in order to fulfil the requirement that metadata from different providers can co-exist for the same CHO. The proxy is also the entity that allows Europeana to enrich data by creating our own metadata based on provider metadata. This is quite important for libraries, as there are sets of data out there that can be used, for example VIAF for name authorities. By creating our own proxy, with our own metadata, we can add these links to the provided metadata.
  • Nightmare slide: the Europeana aggregation aggregates both providers' aggregations. We also have our own proxy with our enhanced metadata added, here shown using VIAF as a unique identifier and SKOS to give two language versions of the creator name.
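
The proxy arrangement described in the last two notes can be sketched with plain (subject, predicate, object) tuples. This is only an illustration under stated assumptions: the `ex:` identifiers, the titles and creator strings, and the VIAF placeholder are all made up, and the ORE property names `ore:proxyFor`, `ore:proxyIn` and `ore:aggregates` do not appear on the slides and are assumed here to be the linking properties.

```python
# All identifiers and metadata values below are hypothetical illustrations.
CHO = "ex:cho/mona-lisa"
VIAF = "ex:viaf/PLACEHOLDER"  # stands in for a real VIAF identifier

triples = [
    # Provider 1: its own aggregation and its own proxy for the CHO.
    ("ex:agg/louvre", "edm:aggregatedCHO", CHO),
    ("ex:proxy/louvre", "ore:proxyFor", CHO),
    ("ex:proxy/louvre", "ore:proxyIn", "ex:agg/louvre"),
    ("ex:proxy/louvre", "dc:title", "La Joconde"),
    ("ex:proxy/louvre", "dc:creator", "Léonard de Vinci"),
    # Provider 2: same CHO, different metadata, kept apart by a second proxy.
    ("ex:agg/dmf", "edm:aggregatedCHO", CHO),
    ("ex:proxy/dmf", "ore:proxyFor", CHO),
    ("ex:proxy/dmf", "ore:proxyIn", "ex:agg/dmf"),
    ("ex:proxy/dmf", "dc:title", "Portrait de Mona Lisa"),
    # Europeana: aggregates both provider aggregations and adds its own
    # proxy carrying the enrichment (a link to a name authority).
    ("ex:agg/europeana", "ore:aggregates", "ex:agg/louvre"),
    ("ex:agg/europeana", "ore:aggregates", "ex:agg/dmf"),
    ("ex:proxy/europeana", "ore:proxyFor", CHO),
    ("ex:proxy/europeana", "ore:proxyIn", "ex:agg/europeana"),
    ("ex:proxy/europeana", "dc:creator", VIAF),
    (VIAF, "skos:prefLabel", ("Leonardo da Vinci", "en")),
    (VIAF, "skos:prefLabel", ("Léonard de Vinci", "fr")),
]


def statements_by_proxy(cho, triples):
    """Group the descriptive statements about one CHO by the proxy making them."""
    proxies = [s for s, p, o in triples if p == "ore:proxyFor" and o == cho]
    return {
        proxy: [(p, o) for s, p, o in triples
                if s == proxy and not p.startswith("ore:")]
        for proxy in proxies
    }


by_proxy = statements_by_proxy(CHO, triples)
for proxy, statements in by_proxy.items():
    print(proxy, statements)
```

Because every descriptive statement hangs off a proxy rather than off the CHO itself, the two providers' titles never collide, and the enrichment stays identifiable as Europeana's own.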

Europeana and open data: Presentation Transcript

  • Europeana and Open Data Robina Clayphan Interoperability Manager, Europeana LDBC TUC meeting, 19 November, 2013
  • What is Europeana? • Europeana is a service that brings together digital content from across the cultural heritage domain in Europe • It makes the metadata freely available • It is a catalyst for change in the world of cultural heritage. • Our vision: We believe in making cultural heritage openly accessible in a digital way, to promote the exchange of ideas and information.
  • Europe’s cultural heritage portal. Museums, national aggregators, regional aggregators, archives, thematic collections, libraries.
    - A network of participants in development and innovation
    - Nearly 30 million objects from 2,400 European galleries, museums, archives and libraries
  • What types of objects does Europeana give access to? Text Image Video Sound 3D
  • Europeana and open data
  • What Europeana makes available Metadata Link to digital objects online
  • Metadata (descriptive object information) Different options: Open – not fully open (but clear) – Not open Two categories of rights CC
  • The Europeana Data Model
  • EDM requirements & principles
    1. Distinction between “provided objects” (painting, book, movie, etc.) and their digital representations
    2. Distinction between objects and metadata records describing an object
    3. Allow for multiple records for the same object, containing potentially contradictory statements about it
    4. Support for objects that are composed of other objects
    5. Support for contextual resources, including concepts from controlled vocabularies
    Richer metadata with finer granularity
  • Provide more semantics to the data Build a semantic layer on top of Cultural Heritage objects
  • EDM Classes
  • The three core classes: an aggregation with a provided CHO and a web resource. edm:aggregatedCHO links the ore:Aggregation (identifier of aggregation) to the edm:ProvidedCHO (identifier of real object); edm:hasView links it to the edm:WebResource (identifier of web resource).
  • The Aggregation with metadata
  • Properties for the Aggregation. The aggregation represents the set of related resources about one real object contributed by one provider. It carries the metadata that is about the whole set.
    Mandatory: edm:aggregatedCHO, edm:dataProvider, edm:isShownBy or edm:isShownAt, edm:provider, edm:rights
    Optional: edm:hasView, edm:object, dc:rights, edm:ugc
  • Properties for the ProvidedCHO. The ProvidedCHO is the cultural heritage object which is the subject of the package of data that has been submitted to Europeana. Properties: dc:contributor, dc:coverage, dc:creator, dc:date, dc:description, dc:format, dc:identifier, dc:language, dc:publisher, dc:relation, dc:rights, dc:source, dc:subject, dc:title, dc:type, dcterms:alternative, dcterms:extent, dcterms:temporal, dcterms:medium, dcterms:created, dcterms:provenance, dcterms:issued, dcterms:conformsTo, dcterms:hasFormat, dcterms:isFormatOf, dcterms:hasVersion, dcterms:isVersionOf, dcterms:hasPart, dcterms:isPartOf, dcterms:isReferencedBy, dcterms:references, dcterms:isReplacedBy, dcterms:replaces, dcterms:isRequiredBy, dcterms:requires, dcterms:tableOfContents, edm:isNextInSequence, edm:isDerivativeOf, edm:currentLocation…
  • Properties for the web resource. One or more digital representations of the provided cultural heritage object. Properties: dc:description, dc:format, dc:rights, dc:source, dcterms:conformsTo, dcterms:created, dcterms:extent, dcterms:hasPart, dcterms:isFormatOf, dcterms:isPartOf, dcterms:issued, edm:isNextInSequence, edm:rights
  • EDM Classes
  • Contextual classes. Representing (real-world) entities related to a provided object as fully fledged resources, not just strings.
    - edm:Agent: foaf:name, skos:altLabel, rdaGr2:biographicalInformation, rdaGr2:dateOfBirth…
    - skos:Concept: skos:prefLabel, skos:altLabel, skos:broader, skos:definition…
    - edm:TimeSpan: skos:prefLabel, dcterms:isPartOf, edm:begin, edm:end…
    - edm:Place: wgs84_pos:lat, wgs84_pos:long, skos:prefLabel, dcterms:isPartOf…
  • Example of a CHO with two contextual classes. The edm:ProvidedCHO [identifier for "real" object] has a dc:creator, the edm:Agent [identifier for person resource] ("Darwin, Charles"; "12-02-1809", "12-04-1882"), and a dc:subject, the skos:Concept [identifier for subject resource] ("Evolution"@en, "Évolution"@fr).
  • Accessing and re-using Europeana data
  • How do users access Europeana content? Europeana aims to provide content in the users’ workflow – where they want it, when they want it. User focused channels: portal, social media exports For programmers: API, search widget, semantic mark up, LOD pilot
  • Europeana’s infrastructure is open for re-use. Europeana data available via:
    - API
    - Search widgets
    - Semantic mark-up ( on portal
    - Linked Open Data pilot
  • Some (approximate) numbers. Europeana database: 30 million objects. LOD pilot: a subset of 20 million objects • contained nearly 1 billion explicit RDF statements • 4 billion once you do all the RDF reasoning (sub-properties, sub-classes, etc.) in OWLIM • Ontotext has already loaded a chunk of the data and is working on updating it, in Europeana Creative.
  • Possible benchmarking queries?
    - Queries for exploring the dataset, e.g. to generate the complete ordered list of Europeana aggregators and the data providers they gather
    - Queries for exploring the objects, e.g. a list of works with a matching location/creator/title; simple graph traversal
    - Expressing EDM constraints (that cannot be done in OWL): can RDF validation help, e.g. where at least one of two properties must be present (title or description)?
    - Queries to assist in data quality improvement: broken links, duplicates (or near duplicates), missing mandatory properties, missing thumbnails, etc.
    For information: we are starting a data quality task force, if you are interested!
  • Useful links
    - Europeana portal
    - Europeana Professional: EDM documentation, Europeana API, LOD pilot
    - Data Quality task force –
    - Europeana Professional blog
    - Facebook
    - Twitter
    - Europeana Thought Lab
    - Europeana end-user blog
  • Thank you Robina Clayphan
  • Bonus slides!
  • EDM design requirements
    - Compatibility with different levels of description: allow different levels of granularity (a book, a page, a detail of an image)
    - Standard metadata format that can be specialized: allow the specification of domain-specific application profiles; enable the re-use of existing standards; allow the extension of the initial model
  • EDM basis
    - OAI ORE (Open Archives Initiative Object Reuse & Exchange) for organizing an object’s metadata and digital representation(s)
    - Dublin Core for descriptive metadata
    - SKOS (Simple Knowledge Organization System) for conceptual vocabulary representation
    - CIDOC-CRM for the modeling of events and relationships between objects
    - Use the Semantic Web representation principles: RDF; re-use and mix different vocabularies together; preserve original data and still allow for interoperability
  • EDM Properties (excluding ESE)
  • Two providers and two aggregations (the same object). [Diagram: an aggregation of DMF and an aggregation of Louvre, each with its own provenance and metadata, both about one cultural heritage object]
  • Europeana aggregation. [Diagram labels: enriched metadata, landing page]
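
The data-quality question raised under "Possible benchmarking queries" (checking that at least one of several properties is present) can be sketched in a few lines of Python. This is a minimal illustration, not Europeana's validation code: the rule lists are taken from the mandatory properties named on the Aggregation slide and in the ProvidedCHO speaker note, while the dict-of-lists record layout, the function names, the sample values, and the use of "TEXT" as the edm:type value for text objects are assumptions made for the example.

```python
# Each rule is a tuple of alternatives: at least one must have a value.
AGGREGATION_RULES = [
    ("edm:aggregatedCHO",),
    ("edm:dataProvider",),
    ("edm:isShownBy", "edm:isShownAt"),
    ("edm:provider",),
    ("edm:rights",),
]
CHO_RULES = [
    ("dc:title", "dc:description"),
    ("dc:subject", "dc:type", "dc:coverage", "dcterms:spatial"),
    ("edm:type",),
]


def missing(record, rules):
    """Return the rules for which the record has no value for any alternative."""
    return [rule for rule in rules if not any(record.get(p) for p in rule)]


def check_cho(cho):
    """Apply the ProvidedCHO rules, plus dc:language for text objects only."""
    problems = missing(cho, CHO_RULES)
    # Assumption: text objects are those whose edm:type is "TEXT".
    if "TEXT" in cho.get("edm:type", []) and not cho.get("dc:language"):
        problems.append(("dc:language",))
    return problems


# Hypothetical records; properties map to lists of values.
painting = {
    "dc:title": ["Een vrouw met een kind in een kelderkamer"],
    "dc:creator": ["Pieter de Hooch"],
    "edm:type": ["IMAGE"],
}
aggregation = {
    "edm:aggregatedCHO": ["ex:cho/1"],
    "edm:dataProvider": ["Rijksmuseum"],
    "edm:isShownAt": ["ex:page/1"],
    "edm:provider": ["Example Aggregator"],
    "edm:rights": ["ex:rights/public-domain"],
}

print(check_cho(painting))
print(missing(aggregation, AGGREGATION_RULES))
```

Keeping each rule as a tuple of alternatives means "A or B" constraints and single mandatory properties go through the same check, which is the kind of constraint the slide notes cannot be expressed in OWL.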