Systems & processes; making
order out of chaos.
Digitisation Open Day, January 2014
Digital Curator, Wellcome Library
Digitisation – process overview
Funding, staff, equipment, IT,
storage, data management
Refine & review processes; document & share
Let's be clear. Sticking
something under a
camera or on a scanner
is the last step in a
There are simpler models…
We have three basic systems…
1. Workflow management system – ‘Goobi’ –
2. Digital object repository – ‘Safety Deposit Box’ –
3. Front end – ‘the player’ – access.
Remember, this doesn’t include cataloguing or bibliographic systems. Here
we’re just talking about the process of creating, storing & delivering digital
content. You have to assume that those other systems are also in place.
• JPEG2000 is our master image format.
• Create dissemination images (JPEG) on the fly.
• Also use PDF, MPEG2, MP3
We don’t have a system of ‘preferred formats’ for digitisation. We use a small
number of ‘master’ formats for efficient data management but we give
consideration to the way in which we disseminate information. JPEG2000 is a
flexible format that allows us to present digitised content in a variety of ways,
whilst allowing for the automated creation of different sizes of JPEG.
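As a rough illustration of why JPEG2000 suits on-the-fly JPEG creation: a JPEG2000 file can be decoded at reduced resolution levels, each level halving the width and height (rounding up). The sketch below picks the most reduced level that still covers a requested display width, so that less data needs decoding before the JPEG is produced. The function names and the default of 5 available levels are assumptions for illustration, not a description of our production setup.

```python
def reduced_size(width, height, level):
    """Pixel dimensions when decoding at a reduced resolution level.

    Each level halves both dimensions, rounding up.
    """
    scale = 2 ** level
    return (width + scale - 1) // scale, (height + scale - 1) // scale


def pick_level(width, height, target_width, max_level=5):
    """Most reduced level whose decoded width is still >= target_width.

    max_level is an assumption; the real limit depends on how the
    master file was encoded. Returns 0 (full resolution) if the target
    is wider than the master.
    """
    best = 0
    for level in range(max_level + 1):
        decoded_width, _ = reduced_size(width, height, level)
        if decoded_width >= target_width:
            best = level
        else:
            break
    return best


# Example: a 4000 x 3000 master, with an 800-pixel-wide JPEG requested.
level = pick_level(4000, 3000, 800)
print(level, reduced_size(4000, 3000, level))
```

The decoded image at the chosen level would then be scaled down to the exact requested size and written out as a JPEG.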
• Manages & tracks the production of content.
• Workflow driven. Highly automated. Project
• Allows us to set very granular access conditions.
• Scalable & highly adaptable to different projects.
Goobi is our workflow tracking & management system for the production of
digital content. Automating as many of Goobi’s processes as possible allows
our work to be both efficient & scalable. Goobi is also the system with which
humans interact the most.
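To make the idea of a workflow-driven, highly automated system concrete, here is a minimal sketch of a job that moves through ordered steps, running automated steps unattended and stopping when a human is needed. This is not Goobi's data model or API; the class names and the step names are hypothetical and chosen only to illustrate the pattern.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    automated: bool   # True if no human action is needed
    done: bool = False


@dataclass
class Job:
    identifier: str
    steps: list = field(default_factory=list)

    def next_step(self):
        """First step not yet completed, or None when the job is finished."""
        for step in self.steps:
            if not step.done:
                return step
        return None

    def run_automated(self):
        """Complete consecutive automated steps; stop at the first human step."""
        completed = []
        while (step := self.next_step()) is not None and step.automated:
            step.done = True
            completed.append(step.name)
        return completed


# Hypothetical step names, for illustration only.
job = Job("item-001", [
    Step("capture images", automated=False),
    Step("convert to JPEG2000", automated=True),
    Step("quality check", automated=False),
    Step("ingest to repository", automated=True),
])
```

Tracking every item this way means the system always knows which step each job is waiting on, and whether that step needs a person.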
How SDB works – behind the scenes
• No public access to SDB.
• Little direct staff access to SDB content.
• High levels of automation of ingest, via Goobi.
• Platform for dissemination mediated by the player.
A centralised repository of & for digital content is a key part of both
preservation of & access to your content. It’s a single place where we both
store & manage our content.
How the player works
• Makes an HTTP request to SDB for content based on the
SDB PUID (the object's unique & permanent ID).
• Draws & implements access conditions from
• Permitted user actions drawn from METS.
• Draws descriptive metadata (DMD) from live catalogue.
The player acts as a single point of access to our content: a unified
delivery mechanism through which all content is delivered. The aim is to make
access to all digital content as seamless & easy as possible, through an
interface that is easy for the user to understand & with which they can quickly
become familiar.
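The request flow described above can be sketched roughly as follows: check the access condition and the permitted actions recorded for the object, and only then build the content request keyed on its PUID. Everything specific here is an assumption for illustration: the base URL, the dictionary standing in for values read from METS, the role names and the example PUID are all hypothetical, and this is not the player's actual code.

```python
from urllib.parse import quote

# Hypothetical repository endpoint, for illustration only.
SDB_BASE_URL = "https://sdb.example.org/content"


def content_url(puid):
    """Build the HTTP request URL for an object from its PUID."""
    return f"{SDB_BASE_URL}/{quote(puid, safe='')}"


def resolve_request(puid, action, mets_record, user_roles):
    """Decide whether a user may perform an action on an object.

    mets_record is a plain dict standing in for values read from METS.
    """
    required = mets_record["access_condition"]
    if required != "open" and required not in user_roles:
        return {"allowed": False, "reason": "access condition not met"}
    if action not in mets_record["permitted_actions"]:
        return {"allowed": False, "reason": "action not permitted"}
    return {"allowed": True, "url": content_url(puid)}


# Hypothetical record and PUID.
record = {"access_condition": "registered", "permitted_actions": {"view"}}
print(resolve_request("b12345678-0001", "view", record, {"registered"}))
```

Keeping both checks in one place is what lets a single player enforce the same rules for every type of content.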
The systems overview
• Goobi. Manages & tracks the production of
• SDB. Repository that stores digitised content
along with its DMD & AMD.
• Player. User interface to view digitised material.
Lessons from Goobi
• Design your workflows (human & digital) in
advance. But be flexible.
• Automate as much as possible, saves time &
• Document processes & procedures.
• Share what you learn.
Lessons from SDB
• Plan your systems integration, which system talks
to which, and how.
• Plan workflows & processes.
• Have a data management plan. Your eggs are all in one basket.
• Plan what you’ll do when it all turns to custard.
Lessons from the player
• The point of digitisation is access, & managed
access is part of preservation.
• Automate access in terms of what a user can do
• Single point of access for all digital content.
• Test user interface & develop with user in mind!
So, to wrap up…
• Digitisation is an end-to-end process that brings
together objects & metadata.
• You have to think about the whole system to deliver
results. The process is one of combining metadata
from different systems.
• Document plans & document process.
• Be prepared to be flexible & to change as
necessary. But try to stick to the plan!
Questions now, questions later…?
Dave Thompson, Digital Curator
firstname.lastname@example.org - @d_n_t