In recent years, IIIF has become the “de facto” standard for presenting, navigating and delivering digital images on the web. It defines several APIs that provide a standard method for describing, analysing and sharing images over the web, as well as "presentation-based metadata" about structured sequences of images. However, to be fully analysed, interpreted and enjoyed, images, and cultural heritage images in particular, should be placed in a “virtual ecosystem” in which they can be related to entities such as people, places, events, fonds, etc., according to different visions and interpretations.
Therefore, since 2017, we have been working on integrating IIIF into a Digital Library environment based on DSpace, the most widely used open source Digital Asset Management System. We developed a dedicated add-on (starting from version 5) that integrates easily with external image servers such as Cantaloupe or Digilib, and we extended the DSpace data model to structure contextual relationships among cultural heritage entities at different levels.
After the DSpace 7 release, we worked with the community to integrate IIIF support into the official DSpace codebase. The DSpace REST API now implements the IIIF Presentation API version 2.1.1, the IIIF Image API version 2.1.1, and the IIIF Search API version 1.0 (experimental). Any IIIF-compliant image server can be integrated. The DSpace Angular frontend uses the Mirador 3.0 viewer.
However, Digital Library requirements are becoming increasingly complex. Therefore, to meet the needs of the cultural heritage domain, we enhanced our DSpace 7-based solutions by developing two further add-ons that integrate and enrich the “IIIF experience” within DSpace: the Document Viewer (for visualizing PDF files within Mirador) and the OCR module (for extracting text from images and indexing it).
By integrating IIIF and DSpace 7 and enriching the platform with new features, it has been possible to go beyond the traditional boundaries of Digital Libraries: structuring a complex system of relationships and building new narratives thanks to interdisciplinarity and the coexistence of different domains.
The proposed two-hour workshop is addressed to librarians, archivists, historians, archaeologists, researchers and all those who want to build their own digital library with DSpace 7 and IIIF. It will introduce attendees to the IIIF integration in DSpace from both the backend and the frontend side.
We will analyze and share our approach and standard workflows for managing cultural heritage documents in DSpace using IIIF, starting with image submission and describing the operations required to make images available to the Mirador image viewer, to extract text via OCR, and to visualize PDFs through the image viewer. We will also show how to relate items to each other in order to build a complex system of relationships between entities that can be explored through network graphs.
IIIF and DSpace 7 - IIIF Conference 2023.pdf
1. IIIF and DSpace 7: integrating tools for exploring Digital Cultural Heritage
Claudio Cortese (4Science), Andrea Bollini (4Science)
2. Agenda
Adding support for IIIF to DSpace 7
DSpace and Digital Libraries
DSpace GLAM
• An ecosystem built on top of IIIF
• IIIF Image Viewer
• Document Viewer
• OCR & Transcription
Extending the DSpace data model
Structuring Digital Cultural Landscapes
Cultural Paths and Storytelling
9. About
DSpace
DSpace supports many file formats, but
historically has lacked good support for
presenting and sharing digital objects
online.
Within the DSpace community interest
in IIIF has existed for several years. Both
local and commercially-supported
solutions (4Science) have been created.
A IIIF integration is now officially
distributed as part of DSpace 7.
10. Integrating IIIF in a Digital
Library Environment
Since 2017, 4Science has been
working on:
- integrating IIIF into a Digital Library
environment based on DSpace,
developing a dedicated add-on
(starting from version 5) that
integrates easily with external
image servers such as Cantaloupe
or Digilib
11. Together we can
WILLAMETTE UNIVERSITY BEGAN
WORKING ON DSPACE 7 AND IIIF
TO ENHANCE ACCESS TO
ITS DIGITAL CONTENT
4SCIENCE AND WILLAMETTE
UNIVERSITY COLLABORATED TO
BRING IIIF TO THE DSPACE
COMMUNITY
12. DSpace 7 and IIIF
together
• Thanks to the fundamental support of
Willamette University, IIIF support
was implemented in DSpace 7 in 2021
• DSpace 7.1 Release: October 2021
• Initial IIIF support: Presentation API 2.1.1,
Image API 2.1.1, Search API (experimental),
Mirador 3.
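As an illustration of what Image API 2.1.1 support means in practice, the sketch below builds request URLs following the URI template from the IIIF specification ({identifier}/{region}/{size}/{rotation}/{quality}.{format}). The base URL and the identifier are hypothetical placeholders, not values taken from DSpace or from any specific image server.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API 2.x image request URL.

    `base` is the image server endpoint (hypothetical here); the
    identifier is percent-encoded, including any slashes it contains.
    """
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """Build the URL of the image information document (info.json)."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


if __name__ == "__main__":
    # Hypothetical server and identifier, for illustration only.
    base = "https://images.example.org/iiif/2"
    print(iiif_image_url(base, "abc/123", size="600,"))
    print(iiif_info_url(base, "abc/123"))
```

A viewer such as Mirador issues requests of this shape for tiles and thumbnails, which is why any IIIF-compliant image server can sit behind the same frontend.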
13. DSpace and IIIF
together
• DSpace 7 IIIF support now
allows institutions to upload images
to DSpace and automatically get a IIIF manifest
for the item, based on item-level and bitstream-level
(image) metadata.
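To make the shape of such a manifest concrete, here is a minimal sketch that walks a Presentation API 2.x manifest (sequences, canvases, images) and lists each canvas with its image service. It assumes the manifest has already been fetched and parsed as JSON; the sample manifest and its URLs are invented for the example and are not DSpace output.

```python
def list_canvases(manifest):
    """Return (label, width, height, image_service_id) for every canvas
    in a IIIF Presentation API 2.x manifest given as a dict."""
    rows = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            service_id = None
            images = canvas.get("images", [])
            if images:
                service = images[0].get("resource", {}).get("service", {})
                # Some manifests give the service as a list of services.
                if isinstance(service, list):
                    service = service[0] if service else {}
                service_id = service.get("@id")
            rows.append((canvas.get("label"), canvas.get("width"),
                         canvas.get("height"), service_id))
    return rows


# Hypothetical manifest with a single canvas, for illustration only.
SAMPLE_MANIFEST = {
    "@type": "sc:Manifest",
    "label": "Sample item",
    "sequences": [{
        "canvases": [{
            "@id": "https://example.org/iiif/item-1/canvas/c0",
            "label": "Page 1",
            "width": 2000,
            "height": 3000,
            "images": [{
                "resource": {
                    "@id": "https://images.example.org/iiif/2/page1/full/full/0/default.jpg",
                    "service": {"@id": "https://images.example.org/iiif/2/page1"},
                },
            }],
        }],
    }],
}

if __name__ == "__main__":
    for row in list_canvases(SAMPLE_MANIFEST):
        print(row)
```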
14. Driven by metadata
IIIF configuration at the Item
level is quite flexible and is
managed using metadata.
Canvas sizes, image labels,
ranges and other settings
are controlled by using the
following fields.
The dspace.iiif.enabled
metadata field MUST be added
to the Item and set to "true".
Otherwise, the Item display will
use the default DSpace view.
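The rule above can be sketched as a small check. The metadata layout used here, a mapping from field name to a list of {"value": ...} entries, is an assumption made for the example, and the function names and view labels are hypothetical; only the field name dspace.iiif.enabled and the value "true" come from the slide.

```python
IIIF_FLAG = "dspace.iiif.enabled"  # field name given on the slide


def iiif_enabled(item_metadata):
    """Return True if the item carries dspace.iiif.enabled = "true".

    `item_metadata` is assumed to map field names to lists of
    {"value": ...} entries (an assumption of this sketch).
    """
    values = item_metadata.get(IIIF_FLAG, [])
    return any(entry.get("value") == "true" for entry in values)


def choose_view(item_metadata):
    """Pick the IIIF viewer when the flag is set, else the default view."""
    return "mirador" if iiif_enabled(item_metadata) else "default"


if __name__ == "__main__":
    enabled_item = {
        "dc.title": [{"value": "Sample digitized book"}],
        IIIF_FLAG: [{"value": "true"}],
    }
    plain_item = {"dc.title": [{"value": "Sample report"}]}
    print(choose_view(enabled_item))
    print(choose_view(plain_item))
```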
17. Embedded Mirador viewer
• Viewer layout is defined
in the default Mirador
configuration.
• Search options are added
dynamically based on Item
metadata.
• Support for indexing OCR
files is not currently
provided by DSpace.
Institutions will need to
develop their own
approach to indexing their
data.
26. DSpace and Digital
Libraries
• Implementing IIIF is a
fundamental achievement in DSpace
history, since it will promote the
use of DSpace in contexts such as those related
to the scientific field and digital cultural
heritage management
• However, digital image management is
only a first step, since Digital Libraries
are becoming increasingly complex
27. Digital and the
fragmentation risk
"Where is the wisdom we
have lost in knowledge?
Where is the knowledge
we have lost in
information?"
Where is the information
we have lost in data?
28. Digital Libraries are the main tools,
in the Humanities, for recomposing
knowledge and
extending it
30. Digital
Libraries in
the 21st
century
• Digital Libraries should not be considered as
mere lists of items grouped into collections
• They are tools allowing the definition of
relationships on different scales and
according to different variability dimensions,
in order to reconstruct digital cultural
landscapes
• A document can be explored and analyzed
in relation to other documents and to all the
information helping to define its context, or
rather its different contexts (historical,
geographical, cultural, etc.)
31. DSpace-GLAM
• To fulfil these needs we extended DSpace and created DSpace-GLAM
• An extension of DSpace aimed at managing Digital Cultural Heritage
• Provides IIIF-based add-ons for curating and exploring digital objects
• Provides an extended and extensible data model to display contextual
relationships at different levels and to manage different metadata
schemas and conceptual models
32. DSpace-GLAM: built on
top of DSpace 7
The current release is based on DSpace 7.
It relies on the new DSpace 7 technological stack
- REST API
- ANGULAR UI
However, DSpace-GLAM provides many more
features for managing Cultural Heritage than DSpace
33. DSpace-GLAM
DSpace-GLAM is a solution for
managing Digital Libraries
The GLAM acronym highlights
its ability to include
and manage multiple cultural
domains: ancient and
modern books, fonds,
museum objects, documents,
video, audio, maps, …
34. Galleries, Libraries,
Archives, Museums
Materials from Museums, Archives, Libraries can be
explored in an integrated way, but without losing the
granularity of cataloging required by the respective
domain standards.
Italian and international cultural metadata standards
(ICCD, ICAR, ICCU, EDM, etc.)
Metadata related to other domains (tourism, botany)
35. We developed 3 add-ons to
"enrich" the IIIF
experience.
• Image Viewer
• Document Viewer
• OCR & Transcription
The 3 add-ons implement
several curation tasks for
easier management of
digital resources
36. IIIF Image Viewer related curation tasks
• Upload images to the IIIF Image Server (photo gallery)
• Upload images to the IIIF Image Server (digitized book)
• Forbid file download for files uploaded to the image server
• Allow file download for files uploaded to the image server
• Create access image for RAW types
• Create multipage PDF from IIIF images
• Remove all metadata bundles created by a previous upload
41. Modeling Structural
Metadata
• Structural Metadata can be
uploaded in batch:
• using the Simple Archive
Format (as in DSpace
community edition)
• using an Excel file
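As a rough illustration of the batch route, the sketch below writes the descriptor files of one Simple Archive Format item: dublin_core.xml, an additional metadata_dspace.xml for the dspace schema, and the contents file listing the bitstreams. The slide does not say how ranges or table-of-contents entries are encoded, so this sketch stops at descriptive metadata and the file list; the helper names, the title and the file names are hypothetical, and the image files themselves would still have to be copied into the item directory.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def _metadata_xml(schema, values):
    """Serialize (element, qualifier, text) triples as a SAF metadata file."""
    root = ET.Element("dublin_core", {"schema": schema})
    for element, qualifier, text in values:
        node = ET.SubElement(root, "dcvalue",
                             {"element": element, "qualifier": qualifier})
        node.text = text
    return ET.tostring(root, encoding="unicode")


def write_saf_item(item_dir, title, image_names):
    """Write the SAF descriptor files for one item (images not copied)."""
    item_dir = Path(item_dir)
    item_dir.mkdir(parents=True, exist_ok=True)
    # Descriptive metadata in the default dc schema.
    (item_dir / "dublin_core.xml").write_text(
        _metadata_xml("dc", [("title", "none", title)]), encoding="utf-8")
    # The dspace.iiif.enabled flag lives in the "dspace" schema.
    (item_dir / "metadata_dspace.xml").write_text(
        _metadata_xml("dspace", [("iiif", "enabled", "true")]), encoding="utf-8")
    # One line per bitstream: file name, a tab, then the target bundle.
    lines = "".join(f"{name}\tbundle:ORIGINAL\n" for name in image_names)
    (item_dir / "contents").write_text(lines, encoding="utf-8")
    return item_dir


if __name__ == "__main__":
    import tempfile
    target = Path(tempfile.mkdtemp()) / "item_001"
    write_saf_item(target, "Sample digitized book",
                   ["page-001.jpg", "page-002.jpg"])
    print(sorted(p.name for p in target.iterdir()))
```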
45. Document Viewer related curation tasks
Extract images from PDF (CMYK)
Extract images from PDF (RGB)
Extract images from a scanned PDF
Extract raw images inside the PDF (no OCR)
Rebuild the PDF ToC
46. The OCR & Transcription module makes it possible to extract text
from images and index it
47. Complete OCR
management
• By means of its curation
tasks, the module is able:
• to extract text from
images via OCR
• to index the extracted
text
48. OCR & Transcription module's curation tasks
• Extract text (hOCR) from images
• Send OCR to the annotation server
• Consolidate hOCR for the fulltext indexing
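To show what consolidating hOCR for indexing can involve, here is a minimal sketch that reads an hOCR fragment and returns its words, their bounding boxes, and a plain-text string suitable for a full-text index. It is not the module's implementation: it uses only the Python standard library, relies on the hOCR conventions of `ocrx_word` spans with a `bbox` in the `title` attribute, and the sample fragment is invented.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Extract (x0, y0, x1, y1) from an hOCR title attribute, if present."""
    for part in title.split(";"):
        fields = part.split()
        if len(fields) == 5 and fields[0] == "bbox":
            return tuple(int(v) for v in fields[1:])
    return None


class HocrWords(HTMLParser):
    """Collect (text, bbox) for every element with class ocrx_word."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._depth = 0   # nesting depth inside the current word element
        self._bbox = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if self._in_word:
            self._depth += 1  # nested markup such as <strong>
        elif "ocrx_word" in (attr.get("class") or "").split():
            self._in_word, self._depth = True, 1
            self._bbox = parse_bbox(attr.get("title") or "")
            self._buf = []

    def handle_data(self, data):
        if self._in_word:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if not self._in_word:
            return
        self._depth -= 1
        if self._depth == 0:
            text = "".join(self._buf).strip()
            if text:
                self.words.append((text, self._bbox))
            self._in_word = False


def hocr_to_words(hocr):
    parser = HocrWords()
    parser.feed(hocr)
    parser.close()
    return parser.words


def hocr_to_text(hocr):
    """Plain text for full-text indexing: words joined by single spaces."""
    return " ".join(text for text, _ in hocr_to_words(hocr))


# Invented hOCR fragment, for illustration only.
SAMPLE_HOCR = """<div class='ocr_page' title='bbox 0 0 800 600'>
<span class='ocr_line' title='bbox 10 10 300 40'>
<span class='ocrx_word' title='bbox 10 10 90 40; x_wconf 95'>Digital</span>
<span class='ocrx_word' title='bbox 100 10 200 40; x_wconf 91'><strong>Library</strong></span>
</span></div>"""

if __name__ == "__main__":
    print(hocr_to_text(SAMPLE_HOCR))
    print(hocr_to_words(SAMPLE_HOCR))
```

Keeping the bounding boxes next to the text is what lets a viewer highlight search hits on the image, while the joined string is what a search index would store.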
51. Extending
DSpace
data model
[Diagram: a Digital Object (Book, Archival Document, Picture, Museum Object, etc.) linked to the entities Fonds, Person, Event, Place, Path, Aggregation, Journal Fonds, Project and OrgUnit, and to Annotations]
52. The data model
- Links the digital object with
People, Places, Events,
Fonds, etc.
- Gives an overview of
artistic productions and
thematic and
historical paths
- Defines a
relationships network to be
explored, navigated and
studied
71. Structuring Digital
Cultural Landscapes
Through DSpace-GLAM, today many institutions
are enhancing the relationships among their
content, shaping "their" digital cultural landscapes
according to the dimensions of variability needed
Digital cultural landscapes are “virtual
ecosystems” in which digital cultural heritage
subsets are related with entities such as people,
places, events, fonds, etc., according to different
visions and interpretations, in order to generate
new knowledge and to open up new perspectives.
Such "digital landscapes" can be visualized either
as Semantic Networks, Paths or Aggregations.
72. The Network
Lab: explore the
relationships
graph
Based on the
relationships defined
at the data model
level, DSpace-GLAM is
able to construct
graphs, thanks to the
Network Lab
73. The Network
Lab: explore the
relationships
graph
For example you can navigate
through the different Byzantine
emperors who commissioned
interventions on a building and
explore their relationships with
architects and workers who
actually did the work
In this way it is possible to uncover
"hidden relationships" producing
new knowledge
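The kind of exploration described above can be sketched as a graph built from relationship triples and searched breadth-first. This is a generic illustration, not the Network Lab's implementation; the entity names and relationship labels are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(relations):
    """Build an undirected adjacency map from (source, label, target) triples."""
    graph = defaultdict(set)
    for source, label, target in relations:
        graph[source].add((label, target))
        graph[target].add((label, source))  # allow navigation both ways
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search for the shortest chain of entities linking
    start to goal; returns None when they are not connected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _, neighbour in sorted(graph.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical relationships, for illustration only.
RELATIONS = [
    ("Emperor A", "commissioned work on", "Building X"),
    ("Emperor C", "commissioned work on", "Building X"),
    ("Architect B", "worked on", "Building X"),
]

if __name__ == "__main__":
    graph = build_graph(RELATIONS)
    print(shortest_path(graph, "Emperor A", "Architect B"))
```

Two entities with no direct relationship, such as the emperor and the architect here, become connected through the building they share, which is one way a "hidden relationship" can surface.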
82. Paths
Paths mainly tell "stories."
In DSpace-GLAM, they have been enhanced and
fully integrated into the data model
Paths can also include objects of different types
(Documents, Photographs, People, Events, Places,
etc.).
The same object can be included in different Paths
84. Paths creation and
storytelling
With DSpace-GLAM, it is easy to create paths by
relating different entities, highlighting contexts,
structuring exhibitions and enhancing itineraries
Greater interaction between text and digital
resources makes it possible to build real narratives
around cultural heritage
90. In DSpace-GLAM each digital object and each
concept (Person, Event, Place, Path, etc.) is a
node in a single semantic network and IIIF is
perfectly integrated in it
91. Thank you for your attention
CLAUDIO CORTESE (4SCIENCE) CLAUDIO.CORTESE@4SCIENCE.COM