The Europeana Datamodel: A semantic layer on top of Cultural Heritage Objects




This presentation was given on 20/12/2013 to mark the 30th anniversary of IBW. Presentation by Roxanne Wyns, Business Consultant at LIBIS.





    Document Transcript

    • 28-3-2014
      The Europeana Data Model: A semantic layer on top of Cultural Heritage Objects
      20-12-2013, Roxanne Wyns - IBW
      Roxanne Wyns – LIBIS, KU Leuven, Roxanne.Wyns@libis.kuleuven.be
    • Content Overview
      • Background
      • Europeana – From Portal to Platform
      • The Europeana Data Model & The Semantic Web
      • EDM basic pattern
      • Some applications…
      • Semantic enrichment @ Europeana
      • Some inspiring examples…
    • Background: LIBIS
      • Library automation service of the KU Leuven
      • Supports scientific and public organisations in managing their library, archival and museum collections
      • Center of knowledge & expertise
      • Expertise in interoperability standards, semantic enrichment and multilingualism
      • European Best Practice research projects (ICT-PSP)
      • Member of several standards and enrichment working groups
    • Before we move to EDM…
    • Europeana: What is Europeana?
      • An internet portal that acts as an interface to millions of books, paintings, films, museum objects and archival records that have been digitised throughout Europe
      • A platform for knowledge exchange that promotes collaboration between librarians, curators, archivists and the creative industries
      • A platform for access and reuse of cultural content by creative industries, research and education (Europeana API)
    • Europeana as a portal
    • From portal… [1]: www.europeana.eu
    • From portal… [2]: From portal to platform… Aggregation by Europeana
      • Gathering digital content from cultural organisations
      • Cross-collection and cross-sectoral (archives, libraries, museums)
      • Bringing this together on the web by using standardised file and metadata formats (ESE)
      • To facilitate resource discovery!
      • Quantity vs. Quality
    • From portal… [3]: A variety of aggregation models
      • Project aggregation (ICT-PSP)
        – Dark aggregators
        – Aggregators + portal
        – Domain, thematic, cross-domain
      • National or regional aggregation
        – Usually with a portal
        – Europeana not the only source
        – Domain, thematic, cross-domain
      • Institutions
    • From portal… [4]: The European Library
    • From portal… [5]: Delivering content
      • Source > [Intermediate] > Target
      • Source = in-house or standard
      • Intermediate = standard format or adaptation of standard
      • Target = ESE
      • Protocols, tools and formats
        – XML or CSV
        – HTTP, FTP or OAI-PMH upload (provided by aggregator)
        – Ingestion and mapping tools (provided by aggregator)
        – OAI-PMH (Europeana)
        – …
    • From portal… [5]: ESE - Europeana Semantic Elements
      – Represents the lowest common denominator for object metadata
      – Basically Dublin Core with some extra Europeana elements
      – Forces interoperability
      – Converts datasets to a “flat” data representation
      – Loss of richness of the original data
      – Not adequate to accommodate domain-specific requirements
      – One digital representation per object record
      http://pro.europeana.eu/ese-documentation
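As a rough illustration of the OAI-PMH delivery route listed above, the following sketch only builds a harvesting request URL and does not contact any server. `ListRecords`, `metadataPrefix` and `set` are standard OAI-PMH request parameters; the endpoint address, the `ese` prefix value and the set name are assumptions made for this example and would differ per aggregator.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not a real aggregator or Europeana address.
BASE_URL = "https://aggregator.example.org/oai"


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting of one collection (set).
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# "ese" and "museum_x" are assumed values; a repository advertises its own
# metadata prefixes and set names.
url = list_records_url(BASE_URL, "ese", set_spec="museum_x")
print(url)
```

A harvester would fetch this URL and keep following the resumption tokens the repository returns until the list is complete.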
    • Moving towards a platform: from ESE XML to EDM RDF
    • … to platform [1]: A change of strategy?!
      • Europeana Strategic Plan 2011 – 2015
        – Aggregate: Europeana as a trusted source for European Cultural Content
        – Facilitate: Support the Cultural Heritage sector through knowledge transfer
        – Distribute: Make Heritage available to everyone, everywhere, at every moment
        – Engage: Find new ways for people to participate in culture
      • New Renaissance Report
      • Focus on open data and reuse of content
        – Linked Open Data, using Semantic Web Technologies
        – Data Exchange Agreement (DEA)
    • … to platform [2]: Transition into 2014 – Why Europeana and not Google?!
      • Shift from portal to platform
      • Data quality emphasis
      • More data providers
      • Value creation for contributing partners (aggregators & content partners)
      • Copyright improvement
      • Multilingual
      • Thematic focus (e.g. Europeana 1914 – 1918)
      Europeana Business Plan 2014 (http://pro.europeana.eu/)
    • … to platform [3]: Positioning Europeana in research, education & creative industries (tourism, social media, UGC…)
      • CEF funding
      • www.europeana1914-1918.eu/
    • … to platform [4]: Focusing on data quality
      • Introduce a star rating system
      • Rights statements increase reuse potential
      • Decent previews
      • Persistent links
      • Metadata records in EDM! Data interoperability, semantic enrichment, multilingualism…
    • … to platform [5]: Delivering content
      • National aggregators!
      • Source > [Intermediate] > Target
      • Source = [in-house] or standard
      • Intermediate = EDM extension or application
      • Target = EDM RDF
      • Protocols, tools and formats
        – XML, CSV, API?
        – HTTP, FTP or OAI-PMH upload (provided by aggregator)
        – Ingestion and mapping tools (provided by aggregator, part of CMS)
        – OAI-PMH, SWORD? (Europeana)
        – …
    • … to platform [5]: Europeana Inside aggregation (http://www.europeana-inside.eu/)
    • … to platform [7]: Europeana API (http://www.europeana.eu/portal/api/console.html)
    • … to platform [8]: Europeana LOD pilot (http://europeana.ontotext.com/)
    • The Europeana Data Model (EDM) & The Semantic Web
    • EDM & The Semantic Web [1]: Moving towards the Europeana Data Model
      • Based on best practices from the different GLAM domains
      • Align the data model to the specific community concerns
      • Providing different levels of granularity
      • Enable the re-use of existing standards
      • Providing the possibility to build domain or sector specific application profiles on EDM
    • EDM & The Semantic Web [2]: EDM requirements
      • Richer metadata - finer granularity
      • Distinguish “provided objects” (painting, book, movie, etc.) from their digital representations
      • Distinguish object from its metadata record
      • Allow multiple records for the same object, containing potentially contradictory statements about it
      • Support for objects that are composed of other objects
      • Support for contextual resources, including concepts from controlled vocabularies
      Introduction to the Europeana Data Model (EDM) (http://pro.europeana.eu/)
    • EDM & The Semantic Web [3]: A semantic layer on top of Cultural Heritage Objects
      • Provides more context to the metadata
      • Allows the representation of specific relationships
        – Similarities between objects
        – Relationships
        – Representations
        – Derivations
      Goal: to make data available as Linked Open Data for re-use by external sources
    • EDM & The Semantic Web [4]: The Semantic Web
      Linking Open Data cloud diagram: from Wikimedia Commons; http://lod-cloud.net/
    • EDM & The Semantic Web [5]: The Semantic Web or Web 3.0
      • Concept introduced by Sir Tim Berners-Lee (W3C)
      • Proposed as the solution for the current problems in sharing and retrieving relevant data on the current Web, where:
        – Content is not well structured, has inexplicit semantics, is not interoperable (HTML, URLs to link)
        – Expressive questions cannot be asked by the user
        – Multiple data queries, human interpretation and knowledge are needed to retrieve relevant and “complete” results
      Moving from documents to data
    • EDM & The Semantic Web [6]: The Semantic Web is an extension of the current Web
      • It includes semantic information (context and meaning!) in web pages
      • This meaning allows both people and machines to better interpret the data
      • It creates links so that a person or machine can explore the web of “related” data via these links
      • These links are at the heart of the Semantic Web and are needed for integration of and reasoning over data on the Web = Linked Data
    • EDM & The Semantic Web [7]: Linked Data principles
      1. Use URIs as names for things
      2. Use HTTP URIs so that people can look up those names
      3. When someone looks up a URI, provide useful RDF information
      4. Include RDF statements that link to other URIs so that they can discover related things
      Tim Berners-Lee 2007 – http://www.w3.org/DesignIssues/LinkedData.html
    • EDM & The Semantic Web [8]: Focus shift
      • In the web of documents we have HTTP URIs identifying resources and links between them, but without context:
        – What kinds of resources are 'Louvre.html' and 'LaJoconde.jpg'?
        – A machine cannot tell, humans can
      Towards a Semantic Research Library. Prof. Dr. Stefan Gradmann (KU Leuven)
    • EDM & The Semantic Web [9]
      • So we add syntax for making statements on the resource using RDF triples and a schema language (RDFS)
      • Extending the Web into the ‘Web of Things’
      Towards a Semantic Research Library. Prof. Dr. Stefan Gradmann (KU Leuven)
    • EDM & The Semantic Web [10]: EDM basis
      • OAI ORE for organizing an object’s metadata and digital representation(s)
      • Dublin Core for descriptive metadata
      • SKOS for conceptual vocabulary representation
      • CIDOC-CRM for events and relationships between objects
      • RDF for Semantic Web representation
    • EDM & The Semantic Web [11]: OAI ORE
      • Open Archives Initiative Object Reuse & Exchange
      • Defines standards for the description and exchange of aggregations of Web resources
      • For combining distributed resources with multiple media types (text, images, video, data…)
      A “bundle” of an object and its digital representation(s)
    • EDM & The Semantic Web [12]: Dublin Core + extra EDM elements
      • For the descriptive metadata of a cultural heritage object
      • edm:ProvidedCHO is the cultural heritage object which is the subject of the package of data delivered to Europeana
      Properties: dc:contributor, dc:creator, dc:date, dc:format, dc:identifier, dc:language, dc:publisher, dc:relation, dc:source, dcterms:alternative, dcterms:extent, dcterms:temporal, dcterms:medium, dcterms:created, dcterms:provenance, dcterms:issued, dcterms:conformsTo, dcterms:hasFormat, dcterms:isFormatOf, dcterms:hasVersion, dcterms:isVersionOf, dcterms:hasPart, dcterms:isPartOf, dcterms:isReferencedBy, dcterms:references, dcterms:isReplacedBy, dcterms:replaces, dcterms:isRequiredBy, dcterms:requires, dcterms:tableOfContents, edm:isNextInSequence, edm:isDerivativeOf, edm:currentLocation…
    • EDM & The Semantic Web [13]: Simple Knowledge Organisation System (SKOS)
      • Solution for converting a “classic” thesaurus or managed vocabulary into a semantically interoperable format
      • Based on the RDF specification
      • Ideal for creating multilingual networks of terminologies
      • Structured according to the ISO 25964 norm
      • Components
        – Concepts
        – Documented
        – URIs
        – Semantically related (BT, NT, RT)
        – Labelled
        – Concept schemes
    • EDM & The Semantic Web [14]
    • EDM & The Semantic Web [15]: Vocabularies play an important role in the Semantic Web and Linked Data world
      • They are the basic building blocks for linking data
      • They help with the interpretation and integration of data between different datasets
      • And so may lead to the discovery of new relationships between information expressed in different natural languages
    • EDM & The Semantic Web [16]: CIDOC – Conceptual Reference Model (ISO 21127)
      • A formal domain ontology for cultural heritage information
      • Describes the things that the cultural heritage sector deals with and how these things relate to each other
      • Expressed as an “object-oriented” schema
      • An object is described according to a series of events that took place in its lifetime
        – When
        – Where
        – Who
        – What
    • EDM & The Semantic Web [17]: CIDOC-CRM events
    • EDM & The Semantic Web [18]: Resource Description Framework (RDF)
      • Forms the basis of Semantic Web technologies
      • Universal language to describe the characteristics of resources on the web
      • Using XML for syntax and URIs for naming
      • Makes statements about resources in the form of subject-predicate-object triples
      • RDF triples provide a labelled connection using URIs to make it possible to link data with one another
      • In this way a machine is able to find the semantic relations between data
    • EDM & The Semantic Web [17]
      • The different parts of a triple are
        – Subject – the thing being described
        – Predicate – a trait, aspect, or property of the thing, which expresses a relationship between the subject and object
        – Object – the thing that is the value of the predicate (trait, aspect or property) for the subject
      • So in the statement “Mona Lisa was created by Da Vinci”
        – Subject – Mona Lisa (La Joconde)
        – Predicate – Created by
        – Object – Da Vinci
      • In terms of representation:
        – Subject – must be a URI
        – Predicate – must be a URI
        – Object – may be a URI or a constant value or “literal”
      [Diagram: KU Leuven "is a" University; KU Leuven "Located in" Flanders]
    • EDM & The Semantic Web [18]: EDM application
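The subject-predicate-object structure described above can be sketched with plain Python tuples and a small N-Triples-style serializer. This is a minimal illustration, not part of the original slides: the `http://example.org/` URIs are hypothetical placeholders, while `dc:creator` and `dc:title` use the real Dublin Core element namespace. Treating any string that starts with `http://` or `https://` as a URI is a simplification made for this sketch.

```python
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core elements namespace
EX = "http://example.org/"                # hypothetical namespace for this sketch


def is_uri(value):
    """Simplified check: anything starting with http(s):// counts as a URI."""
    return value.startswith(("http://", "https://"))


def to_ntriples(subject, predicate, obj):
    """Serialize one triple; subject and predicate must be URIs."""
    if not (is_uri(subject) and is_uri(predicate)):
        raise ValueError("subject and predicate must be URIs")
    if is_uri(obj):
        obj_term = f"<{obj}>"
    else:
        # Literal object: escape backslashes and double quotes.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {obj_term} ."


triples = [
    (EX + "MonaLisa", DC + "creator", EX + "DaVinci"),  # object is a URI
    (EX + "MonaLisa", DC + "title", "La Joconde"),      # object is a literal
]
lines = [to_ntriples(*t) for t in triples]
print("\n".join(lines))
```

Because the creator is itself identified by a URI, further statements can be attached to it, which is what makes the data linkable.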
    • EDM basic pattern
    • EDM basic pattern [1]
      • A data provider submits to Europeana a “bundle” of an object and its digital representation(s)
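The “bundle” of an object and its digital representations can be sketched as a handful of triples. The class and property names used here (`ore:Aggregation`, `edm:ProvidedCHO`, `edm:WebResource`, `edm:aggregatedCHO`, `edm:hasView`) follow the published EDM and OAI ORE vocabularies as far as this editor is aware; the resource URIs are hypothetical, and a real submission carries more properties than shown.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EDM = "http://www.europeana.eu/schemas/edm/"
ORE = "http://www.openarchives.org/ore/terms/"


def build_bundle(aggregation, cho, web_resources):
    """Return triples tying one provided object to its digital representations."""
    triples = [
        (aggregation, RDF_TYPE, ORE + "Aggregation"),
        (cho, RDF_TYPE, EDM + "ProvidedCHO"),
        (aggregation, EDM + "aggregatedCHO", cho),
    ]
    for web_resource in web_resources:
        triples.append((web_resource, RDF_TYPE, EDM + "WebResource"))
        triples.append((aggregation, EDM + "hasView", web_resource))
    return triples


# Hypothetical identifiers, for illustration only.
agg = "http://example.org/aggregation/1"
cho = "http://example.org/object/1"
views = ["http://example.org/img/1-front.jpg", "http://example.org/img/1-back.jpg"]

bundle = build_bundle(agg, cho, views)
for triple in bundle:
    print(triple)
```

Unlike ESE, which allowed one digital representation per record, the aggregation can point to any number of views.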
    • EDM basic pattern [2]: Musical Instruments Museums Online (http://www.mimo-db.eu/), Rodolphe Bailly
    • EDM basic pattern [3]: Using DC as the basis for ProvidedCHO
      • Advantages
        – Widespread
        – Simple
        – Stable
        – Cross-domain
      • Disadvantages
        – Not rich enough
        – Lack of structure
        – No differentiation between the object itself and its digital surrogate (e.g. creator and photographer both end up in dc:creator)
        – Loss of relationships between different classes of data and events (no relation between who, where, what, when)
    • EDM basic pattern [4]: “Proxies”
      • Describing the provided object as seen from the perspective of a specific provider
      • Used for
        – Connecting duplicates of cultural heritage object descriptions coming from different providers, each with its own metadata
        – Adding Europeana enrichments about a resource
        – Keeping each provider’s metadata distinct
        – Keeping Europeana metadata distinct from the providers’ metadata
    • EDM basic pattern [5]: aggregation of DMF, aggregation of Louvre
      Introduction to the Europeana Data Model (EDM) (http://pro.europeana.eu/)
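The role of proxies can be sketched as follows: each provider's statements are attached to its own proxy rather than to the object itself, so differing descriptions of the same object stay separate. `ore:proxyFor` and `ore:proxyIn` come from the OAI ORE vocabulary; all URIs and the two title values are hypothetical and only loosely modelled on the DMF and Louvre example named on the slide.

```python
ORE = "http://www.openarchives.org/ore/terms/"
DC = "http://purl.org/dc/elements/1.1/"


def build_proxy(proxy, cho, aggregation, statements):
    """Attach a provider's descriptive statements to its proxy, not to the object."""
    triples = [
        (proxy, ORE + "proxyFor", cho),
        (proxy, ORE + "proxyIn", aggregation),
    ]
    triples.extend((proxy, predicate, value) for predicate, value in statements)
    return triples


def values_by_proxy(triples, predicate):
    """Collect one predicate's value per proxy, keeping providers apart."""
    return {s: o for s, p, o in triples if p == predicate}


cho = "http://example.org/object/MonaLisa"  # hypothetical shared object URI
graph = []
graph += build_proxy("http://example.org/proxy/louvre", cho,
                     "http://example.org/aggregation/louvre",
                     [(DC + "title", "La Joconde")])
graph += build_proxy("http://example.org/proxy/dmf", cho,
                     "http://example.org/aggregation/dmf",
                     [(DC + "title", "Mona Lisa")])

titles = values_by_proxy(graph, DC + "title")
print(titles)
```

Both proxies point at the same object, yet neither title overwrites the other. Europeana's own enrichments could be held on a further proxy in the same way.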
    • EDM basic pattern [6]: Hierarchical objects
      Introduction to the Europeana Data Model (EDM) (http://pro.europeana.eu/)
    • Let’s have a look at some applications…
    • EDM records [1]: PartagePlus record provided as EDM
    • EDM records [2]: Mapping LIDO2EDM
      • Respect for the actual specifications of both models in order to ensure semantic validity of the resulting EDM
      • Only a subset of the (core) LIDO elements is mapped
      • When a value starts with 'http://' or 'https://' it becomes an 'rdf:resource' in the EDM record, otherwise it is included as a literal
      • In addition, an EDM property is created with the preferred label for the concept or agent, as a literal, in the language of the metadata record
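The URI-or-literal rule from the LIDO2EDM mapping can be sketched with the standard library's ElementTree. This is an illustration of the rule as stated on the slide, not the actual mapping code: the function name, the choice of `dc:subject`, and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
EDM = "http://www.europeana.eu/schemas/edm/"


def add_property(parent, namespace, local_name, value):
    """Add a property as an rdf:resource reference or as a literal."""
    element = ET.SubElement(parent, f"{{{namespace}}}{local_name}")
    if value.startswith(("http://", "https://")):
        # URI values become references to another resource.
        element.set(f"{{{RDF}}}resource", value)
    else:
        # Everything else is kept as plain text.
        element.text = value
    return element


cho = ET.Element(f"{{{EDM}}}ProvidedCHO")
linked = add_property(cho, DC, "subject", "http://example.org/concept/poster")  # hypothetical URI
literal = add_property(cho, DC, "subject", "Poster")  # preferred label as literal

print(ET.tostring(cho, encoding="unicode"))
```

Emitting the label next to the reference mirrors the last bullet above: the record stays readable even where the linked vocabulary is not resolved.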
    • EDM records [3]: Mapping LIDO2EDM
      • Qualifying information for agents (dc:creator, dc:contributor), dates (dc:date), places (dcterms:spatial) is lost
      • LIDO-based ingestions would benefit from a full implementation of the EDM model
    • EDM records [4]
    • Semantic enrichment @ Europeana: Opportunities and pitfalls
    • Semantic enrichment @ Europeana [1]: Semantic tagging
      • Using the AnnoCultor tool (http://semium.org)
        – Interprets values
        – Searches for corresponding terms in specialised vocabularies
        – Adds links to matching terms (dcterms:spatial = Venise is linked to place http://sws.geonames.org/3164603/)
        – Pulls in additional information about this record (βενετία, velence, венеция, venice, etc.)
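The tagging steps above (interpret a value, look it up in a vocabulary, add a link, pull in extra labels) can be sketched as a dictionary lookup. This is not how AnnoCultor is implemented; it is a toy version under stated assumptions. The GeoNames URI and the labels are the ones quoted on the slide, while the record layout and the `_uri` / `_labels` field names are invented for this example.

```python
# Tiny stand-in vocabulary: one place URI with its labels in several languages.
VOCABULARY = {
    "http://sws.geonames.org/3164603/": ["Venise", "βενετία", "velence", "венеция", "venice"],
}


def build_index(vocabulary):
    """Map every case-folded label to the URI of its term."""
    return {label.casefold(): uri
            for uri, labels in vocabulary.items()
            for label in labels}


def enrich(record, field, index, vocabulary):
    """Return a copy of the record with a link and extra labels when a term matches."""
    enriched = dict(record)
    uri = index.get(record.get(field, "").strip().casefold())
    if uri is not None:
        enriched[field + "_uri"] = uri                        # hypothetical field name
        enriched[field + "_labels"] = list(vocabulary[uri])   # hypothetical field name
    return enriched


index = build_index(VOCABULARY)
tagged = enrich({"dcterms:spatial": "Venise"}, "dcterms:spatial", index, VOCABULARY)
untouched = enrich({"dcterms:spatial": "Atlantis"}, "dcterms:spatial", index, VOCABULARY)
print(tagged)
```

The pulled-in labels are what allow a search for “venice” to find a record that only says “Venise”.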
    • Semantic enrichment @ Europeana [2]: Enriched elements
      • Place enrichment (edm_place:*)
        – Subset of GeoNames (www.geonames.org)
        – Limited to European geographic locations
        – Limited to the prefixes "A", "P.PPL", "S.CSTL", "S.ANS", "S.MNMT", "S.LIBR", "S.HSTS", "S.OPRA", "S.AMTH", "S.TMPL", "T.ISL" (http://www.geonames.org/statistics/total.html)
        – Enrichment limited to EDM fields “dcterms:spatial” and “dc:coverage”
        – Enrichment rules: exact matching?
        – Result: 5.8M objects enriched, provides multilingual search on places: http://europeana.eu/portal/search.html?query=edm_place%3A*
      • Issues?
        – Appear to be limited
        – But only places in Europe are enriched
        – And only for the geographical coverage EDM elements
    • Semantic enrichment @ Europeana [3]
      • Concept (topic) enrichment
        – Using GEMET thesaurus (http://www.eionet.europa.eu/gemet/)
        – 12 concepts removed to avoid linking with homonyms (e.g. Druck)
        – Some WWI battles and the two categories “World War I” and “art” are taken from DBpedia
        – Enrichment limited to EDM fields “dc:subject” and “dc:type”
        – Enrichment rules: exact matching?
        – Result: 9M objects enriched: http://www.europeana.eu/portal/search.html?query=skos_concept%3A*
      • Issues?
        – Exact matching not limited to the language of the record (Dutch “Tegel” mapped to the Swedish “Tegel”, meaning brick)
        – No suitable multilingual concept thesauri for the cultural domain drawing
        – Noise because of metadata quality (dc:type “photo”, “book”, “video”,…)
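The “Tegel” problem noted above can be reproduced in a few lines: exact matching across all languages links a Dutch value to a Swedish concept, while restricting the match to the record's language avoids it. The two-concept vocabulary and its URIs are hypothetical; only the Dutch and Swedish homonym comes from the slide.

```python
# Hypothetical mini-thesaurus: Dutch "tegel" is a tile, Swedish "tegel" is a brick.
CONCEPTS = [
    {"uri": "http://example.org/concept/tile", "labels": {"nl": "tegel"}},
    {"uri": "http://example.org/concept/brick", "labels": {"sv": "tegel"}},
]


def match(value, record_lang=None):
    """Exact label match; optionally only against labels in the record's language."""
    needle = value.casefold()
    hits = []
    for concept in CONCEPTS:
        for lang, label in concept["labels"].items():
            if label.casefold() != needle:
                continue
            if record_lang is None or lang == record_lang:
                hits.append(concept["uri"])
    return hits


any_language = match("Tegel")          # language ignored: both concepts match
dutch_only = match("Tegel", "nl")      # restricted to the record's language
print(any_language, dutch_only)
```

The restriction only helps when the record's language is known and reliable, which is again a metadata quality question.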
    • Semantic enrichment @ Europeana [4]
    • Semantic enrichment @ Europeana [5]
    • Semantic enrichment @ Europeana [6]
      • Agent (person) enrichment
        – Small set of artists (painters) from DBpedia
        – Enrichment limited to EDM fields “dc:creator” and “dc:contributor”
        – Enrichment rules: exact matching?
        – Result: 136K objects enriched: http://www.europeana.eu/portal/search.html?query=edm_agent%3A*
      • Issues?
        – Quality or structure of provided metadata
    • Semantic enrichment @ Europeana [7]
      • Time period enrichment
        – Using Semium time periods vocabulary (http://semium.org/time/)
        – Partly automatically generated (3rd quarter of 15th century) / manually generated (Roman empire)
        – Enrichment limited to EDM fields: dc:date, dc:coverage, dc:temporal, edm:year
        – Enrichment rules: exact matching?
        – Result: 13.3M objects enriched: http://www.europeana.eu/portal/search.html?query=edm_timespan%3A*
      • Issues?
        – Some words (qualifiers to dates, e.g. “made”, “printed”…) have been removed from fields prior to enrichment, but this is only done for English records
        – So again a problem of quality or structure of the provided metadata
        – Huge issues with BC dates, but also date ranges (e.g. "1701/1800" is mapped to "1701" only)
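The date-range issue above ("1701/1800" reduced to "1701") is the kind of loss a small parser can avoid by keeping both endpoints. The sketch below is an assumption about how such values could be handled, not Europeana's enrichment code; it covers only plain years and slash-separated ranges and deliberately leaves BC dates and qualifier words unhandled.

```python
import re

RANGE = re.compile(r"^\s*(\d{1,4})\s*/\s*(\d{1,4})\s*$")  # e.g. "1701/1800"
YEAR = re.compile(r"^\s*(\d{1,4})\s*$")                   # e.g. "1701"


def parse_years(value):
    """Return (start, end) for a year or a year range, or None if not understood."""
    range_match = RANGE.match(value)
    if range_match:
        start, end = int(range_match.group(1)), int(range_match.group(2))
        return (start, end) if start <= end else None
    year_match = YEAR.match(value)
    if year_match:
        year = int(year_match.group(1))
        return (year, year)
    # Qualifiers ("made 1701"), BC dates and free text are left for other rules.
    return None


print(parse_years("1701/1800"))
print(parse_years("1701"))
print(parse_years("made 1701"))
```

Returning `None` for anything unrecognised keeps the enrichment from guessing, which matters given the metadata quality issues listed on the slide.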
    • Semantic enrichment @ Europeana [8]: Pitfalls
      • General problems
        – Not enough suitable multilingual sources for the DCH domain
        – Automatic enrichment vs. manual enrichment
        – Quality of the metadata
      • Possible solutions
        – Indexing vs. display elements
        – Full implementation of EDM
        – Further extension of EDM
        – Gather basic vocabularies and existing multilingual terminologies
        – Provide a platform for contributing to translations and mapping vocabularies
        – Collect lists for certain metadata fields with a limited number of values, such as format, language, country, date-time ranges,…
        – Create awareness on Europeana enrichment!
    • Semantic enrichment @ Europeana [9]: Opportunities
      • Multilingual access to over 28 million records
      • More enriched elements
      • Freely available for re-use (DEA)
      • Closer to original metadata thanks to EDM
      • Data can be contextualized, semantically linked to other data
      • Allows for richer semantic query expansion & cross-collection browsing
    • Some inspiring examples…
    • Examples [1]: www.thepund.it
    • Examples [2]
    • Examples [3]: www.researchspace.org
    • Examples [4]
    • Questions? Thank you!
      Roxanne Wyns – LIBIS, KU Leuven, Roxanne.Wyns@libis.kuleuven.be
    • www.libis.be
    • Sources
      • Europeana portal: http://www.europeana.eu/portal/ ; http://www.europeana.eu/portal/api
      • Europeana Professional: http://pro.europeana.eu/
        – Introduction to the Europeana Data Model (EDM)
        – Europeana Data Model (EDM) documentation
        – Europeana Business Plan 2014
      • SPARQL end-point of data.europeana.eu: http://europeana.ontotext.com/
      • Towards a Semantic Research Library: Digital Humanities Research, Europeana and the Linked Data Paradigm, Prof. Dr. Stefan Gradmann (KU Leuven)
      • Europeana 1914-1918: http://www.europeana1914-1918.eu/
      • DM2E: http://dm2e.eu/
      • Pundit: www.thepund.it
      • The ResearchSpace: www.researchspace.org
      • Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy, Marlies Olensky, Juliane Stiller, and Evelyn Dröge
      • Semantic enrichment at Europeana – memo, November 4, 2013, Antoine Isaac
      • Mapping survey LIDO 2013-10-30, Regine Stein