Presentation on the DM2E data model, a specialisation of the EDM for the domain of (handwritten) manuscripts. Held at the EDM-Tutorial (22.09.) at the TPDL 2013 on Malta.
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
DM2E Data Model
1. co-funded by the European Union
The DM2E Data Model
Specialising the EDM for Handwritten Manuscripts
Steffen Hennicke
Berlin School of Library and Information Science
Humboldt-Universität zu Berlin
22. September 2013
EDM-Tutorial at TPDL 2013
2. EDM-Tutorial, TPDL 2013, 22.09.2013 222.09.2013
Digitised Manuscripts to Europeana
• Consortium with partners from Germany, Austria,
Norway, Greece, UK, France, and Italy funded by the
EU (2012-2015).
• DM2E works on
– a tool-chain for data migration to Europeana and the LOD
Web (OMNOM),
– a digital research environment for the Digital Humanities
(PUNDIT),
– an open community of cultural heritage professionals
(OPENGLAM)
4. EDM-Tutorial, TPDL 2013, 22.09.2013 422.09.2013
Content
• Wittgenstein Source (UiB): Manuscripts by Ludwig
Wittgenstein
TEI-XMLTEI-XML
TEI-XMLTEI-XML
•
descriptive metadatadescriptive metadata
•
object dataobject data
•
digitised manuscriptsdigitised manuscripts
•
transcriptstranscripts
•
pages, paragraphs etc.pages, paragraphs etc.
5. EDM-Tutorial, TPDL 2013, 22.09.2013 522.09.2013
Content
Islamic Scientific ManuscriptIslamic Scientific Manuscript
Initiative (MPWIG)Initiative (MPWIG)
Relational DatabaseRelational Database
Nietzsche Source –Nietzsche Source – DigitaleDigitale
Faksimile GesamtausgabeFaksimile Gesamtausgabe
(CRNS)(CRNS)
ProperietaryProperietary
European Association forEuropean Association for
Jewish CultureJewish Culture (EAJC)(EAJC)
EADEAD
Codices and Austrian BooksCodices and Austrian Books
Online (ONB)Online (ONB)
MARCXML/MAB2MARCXML/MAB2
118.000+ items with118.000+ items with
20.006.930+ pages20.006.930+ pages
6. EDM-Tutorial, TPDL 2013, 22.09.2013 622.09.2013
DM2E Data Model
• Semantically and structurally heterogeneous data
– e.g. EAD, METS, TEI, MARCXML and MAB2, relational databases,
proprietary schemas
• The Europeana Data Model (EDM) is made for this
scenario!
– provides a generic semantic interoperability layer
– enables the definition of “applications profiles” which may
address the needs of specific communities
• The DM2E Data Model (DM2E)
– is an “application profile” of the EDM for the domain of
handwritten manuscripts
– retains rich descriptions by specialising the EDM
7. EDM-Tutorial, TPDL 2013, 22.09.2013 722.09.2013
What Does “Specialising” Mean?
• RDF(S) allows the specialisation
of EDM classes and properties
– use of rdfs:subClassOf
– use of rdfs:subPropertyOf
• An “application profile” typically
also includes
– additional ontological restrictions
– documentation
8. EDM-Tutorial, TPDL 2013, 22.09.2013 822.09.2013
Guidelines for Specialising
• Empirical analysis of provided source metadata
• Iterative mappings to the EDM
• Close cooperation with data providers
– agree on shared conceptualisations
• Create rich and connected representations
– retain original semantics as much as possible
– use existing URIs of resources
– assign a class to the resources (rdf:type)
9. EDM-Tutorial, TPDL 2013, 22.09.2013 922.09.2013
Reuse of Existing Namespaces
• Create new classes or properties in the DM2E-Namespace only if
there is no other suitable option available
– reuse existing namespaces (ontologies)
– mind existing semantics (scope notes, domains, ranges)
• Types, roles and relations between agents
– Friend-of-a-Friend (FOAF) [FOAF] (types of agents)
– Publishing Roles Ontology (PRO) [SPAR] (roles of agents in the publication
process)
– VIVO [VIVO] (types of agents)
• Detailed semantics on bibliographic entities
– FRBR-aligned Bibliographic Ontology (FaBiO) [SPAR]
– Citation Typing Ontology (CiTO) [SPAR]
– Bibliographic Ontology (BIBO) [BIBO]
10. EDM-Tutorial, TPDL 2013, 22.09.2013 1022.09.2013
Classes
• 23 new or reused classes, mainly for
– physical and conceptual parts of a
handwritten manuscripts
• as found in our source metadata
– different types of Agents
11. EDM-Tutorial, TPDL 2013, 22.09.2013 1122.09.2013
edm:PhysicalThing
Physical and tangible aspects of
handwritten manuscripts.
12. EDM-Tutorial, TPDL 2013, 22.09.2013 1222.09.2013
Contextual Resources: Concept
Conceptual or logical aspects of
handwritten manuscripts.
13. EDM-Tutorial, TPDL 2013, 22.09.2013 1322.09.2013
Contextual Resources: Agent
Different types of agents.
14. EDM-Tutorial, TPDL 2013, 22.09.2013 1422.09.2013
Properties
• Property-centric modelling
– more than 50 new properties
• Documentation for the DM2E Data Model contains only EDM properties
which are utilized
– to keep the documentation clear
– e.g. dcterms:replaces, dc:source, or dc:conformsTo are not used
• Domain and Range Restrictions
– some OWL-Restrictions on properties in order to encourage the use of specific
resources of a specific type, e.g.
• CHO hasPart CHO
• WebResource hasPart WebResource
• Some EDM-Properties are mandatory in DM2E
– dc:type: at least one of the physical (e.g. dm2e:Page) or logical (e.g.
dm2e:Paragraph) aspects
– dc:subject: ideally an URI from a controlled vocabulary
15. EDM-Tutorial, TPDL 2013, 22.09.2013 1522.09.2013
ore:Aggregation
• Properties for creation, version, and modification of
records extended to the Aggregation
– dcterms:rightsHolder, dcterms:creator, dc:contributor with
range edm:Agent
– dcterms:created with range Literal (xsd:DateTime)
– dcterms:modified with range Literal (xsd:DateTime)
• dm2e:hasAnnotatableVersionAt
– for Pundit to include object data
– specific to the DM2E infrastructure
16. EDM-Tutorial, TPDL 2013, 22.09.2013 1622.09.2013
edm:ProvidedCHO
• About 49 new properties with the domain
edm:ProvidedCHO
– focus on object-centric descriptions
– event-centric descriptions rely on the existing EDM facilities
• Rich description of the object of interest
– prepare for future scholarly applications
– enable usage scenarios which need an interoperability
umbrella and rich semantics
– example: Pundit instance for the “Wittgenstein Incubator”
18. EDM-Tutorial, TPDL 2013, 22.09.2013 1822.09.2013
Related Resources
• Descriptive and contextual relations of the
ProvidedCHO to other resources
domain: edm:ProvidedCHO
range: various types of resources
19. EDM-Tutorial, TPDL 2013, 22.09.2013 1922.09.2013
Related Resources
domain: edm:ProvidedCHO
range: various types of resources
20. EDM-Tutorial, TPDL 2013, 22.09.2013 2022.09.2013
Related Resources
domain: edm:ProvidedCHO
range: various types of resources
22. EDM-Tutorial, TPDL 2013, 22.09.2013 2222.09.2013
Identifier and Title
domain: edm:ProvidedCHO
range: rdfs:Literal
23. EDM-Tutorial, TPDL 2013, 22.09.2013 2322.09.2013
Object Relations
• Domain and range “restrictions” to encourage proper use
of structural and other inter-object relations
• Implemented by using owl:Restriction on classes for
– dcterms:hasPart
– dcterms:isPartOf
– edm:isDerivativeOf
– edm:isNextInSequence
• Examples
– edm:ProvidedCHO dcterms:hasPart edm:ProvidedCHO,
– edm:WebResource dcterms:isPartOf edm:WebResource
– edm:Place dcterms:isPartOf edm:Place
– edm:TimeSpan dcterms:hasPart edm:TimeSpan
24. EDM-Tutorial, TPDL 2013, 22.09.2013 2422.09.2013
Summary
• The DM2E Data Model is an application
profile of the EDM for the domain of
handwritten Manuscripts
• DM2E v1.0: Latest and first operational
version of the DM2E data model
• Work is on-going and feedback welcome!
WP1 aggregates, analysis, and maps data from content providers. WP2 converts data to the DM2E data model and delivers it to Europeana and WP3 and exposes it as LOD. WP3 uses the DM2E data in Pundit and exposes the result of annotation activities as LOD. OMNOM: Bundle von Webservices für die Mapping und Konversion nach EDM/DM2E.
EDM (and Europeana) offers the chance for a broadly accepted generic interoperability layer which at the same time allows for easy specialisation of its semantics.