1. 10/7/13
1
Preservation Strategies for
Digital Image Collections
Franziska Frey
Malloy Rabinowitz Preservation Librarian
Head of Preservation and Digital Imaging Services
Harvard Library
Requirements To Make Digital Work
• Deep and longstanding institutional
commitment to traditional preservation
• Full integration of technology into
information management procedures and
processes
• Significant leadership in developing
appropriate definitions and standards
Responsibility
• “Digital preservation will only happen if
organisations and individuals accept
responsibility for it.
• Acceptance of responsibility should be
explicitly and responsibly declared…”
UNESCO, “Guidelines for the Preservation of Digital
Heritage,” 2003
Control
• Move objects to a safe place
• Uniquely identify and describe images with
appropriate metadata for resource discovery,
management, and preservation
• Use standardised metadata schemas for
interoperability
• Ensure that links between digital objects and
their metadata are securely maintained, and
that the metadata are also preserved.
UNESCO, “Guidelines for the Preservation of Digital Heritage,”
2003
Stewardship
• Long-term management of heritage materials
(digital objects) through collaboration, throughout
all phases of object life cycle.
– Rights holders
– Collection managers
– Repository/preservation staff
– Centers of expertise (researchers, scientists)
– Auditors
– Content users and their communities
Stewardship ─ Collection
Manager’s Responsibilities
• Intellectual property rights: manage legal rights,
including rights to make copies
• Metadata: provide appropriate administrative,
technical, and structural metadata for objects
• Discovery: ensure that descriptions of objects are
publicly available in online discovery systems
• Access: ensure that a version of the object is available
to the Harvard community
• Financial considerations: pay for repository and
preservation services
Harvard University Library, DRS Policy Guide
Techniques to Preserve Images
• Phase 1―Production
• Phase 2―Appraisal
• Phase 3―Deposit
• Phase 4―Archiving and Preservation
• Phase 5―Discovery and Delivery
Phase 1 ― Production
• Imaging does matter
• Formats do matter
• Documentation does matter
Art-si.org
Image Quality Matters
• High quality images can be repurposed and are
worth maintaining
• Steve Puglia: “We feel that the managed
environment needs to be extended beyond the
digital repository and forwarded in time to include
the digitization process…”
(IS&T Archiving Conference, 2008)
Building Teams
• Preserving visual cultural heritage
materials involves one additional field:
Imaging Science
– It is imperative that the person
involved in creating these materials,
whether born digital or digitized, has
a good knowledge of imaging
Consequences of These Decisions
Vis-à-vis Preservation
• Resolution
– As size increases (e.g., decisions to
capture and keep 48 bit, high resolution
files), management overhead increases
• This holds true especially if the storage
unit bills per MB or GB per year
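A rough sketch of how bit depth and resolution feed into a per-GB storage bill. The pixel dimensions, the rate of 5.00 per GB per year, and the collection size are hypothetical values chosen for illustration; the function names are not from any repository's tooling, and the size calculation ignores file headers and compression.

```python
def uncompressed_size_bytes(width_px, height_px, bits_per_pixel):
    """Size of the raw pixel data, ignoring headers and compression."""
    return width_px * height_px * bits_per_pixel // 8


def annual_storage_cost(size_bytes, rate_per_gb_year, copies=1):
    """Yearly bill at a flat per-GB rate (1 GB taken as 10**9 bytes)."""
    return size_bytes / 1e9 * rate_per_gb_year * copies


# Hypothetical capture: 8000 x 6000 pixels at 48 versus 24 bits per pixel
size_48 = uncompressed_size_bytes(8000, 6000, 48)
size_24 = uncompressed_size_bytes(8000, 6000, 24)
print(size_48 / 1e6, "MB per image at 48 bit")
print(size_24 / 1e6, "MB per image at 24 bit")

# Hypothetical collection of 10,000 images at 5.00 per GB per year
collection_cost = annual_storage_cost(size_48 * 10_000, 5.0)
print(collection_cost, "per year for the 48 bit masters")
```

Doubling the bit depth doubles the raw size, so under a flat rate the yearly bill doubles as well, and it recurs for as long as the files are kept.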
How are Digital Libraries Evaluated?
• Almost no research on implications of
image quality
• User interfaces and usability in terms of
finding the right image have been
evaluated
• Why this gap?
– Do users know what they can demand in terms of
image quality?
• Visual literacy
– Image quality studies are complex and expensive
Reproductions of Cultural Heritage
Materials Needed for…
• On-line databases
• Posters, calendars, and
postcards
• Exhibition catalogues
• Education
• Conservation
• And more
Survey—Imaging Purposes
• To protect vulnerable originals from use: 67%
• To produce printed reproductions: 77%
• To make collection accessible over the Internet: 86%
• To include in a collection management system: 86%
• To document conservation treatment: 58%
• Other: 28%
However…
• Reproducing cultural heritage materials can be
difficult
– Color and texture
– Printing may be taking place half a world away
• It is of interest to limit the number of times an
artwork is imaged
– Potential for damage to the artwork
– Expensive
• Resources are limited
– Budget cuts
– Many institutions do not have dedicated reproduction
departments
Viewing Conditions
• Reproductions are viewed under various
lighting conditions…
– Museum shop, living room, class room
– Displays
• …even for image evaluation
– Light booth, gallery, office
• Significant issues
– Metamerism
– Color appearance
– Consistency
Project Objectives
• Determine the optimal reproduction processes
presently available
– Understand the workflow processes in use in
cultural heritage institutions today
– Determine the image quality inherent in these
processes in print and on line
– Understand the image quality expectations of the
users involved
• Develop a framework to serve as a guideline
for cultural heritage institutions to follow when
reproducing fine art
ImageMuse
• Establish a user group devoted to imaging,
archiving, and reproducing cultural heritage
• 17+1 institutions took part in our experiments
Image Quality Metrics
• Document current workflows
• Develop a practical characterization test
method: industry solutions
• Document available targets to measure
objective image quality
Workflow Charts
Capture
Illumination
Camera
Post-Processing
Proofing
Further Processes
Documented Reproduction Workflows
Workflow process (general function), with specific and additional steps and considerations:
1. Image capture
– Objective targets used
– Lighting setup used to illuminate the artwork, including polarization
– Camera calibration
– Flat-fielding
2. Proofing and image file preparation
– Monitor calibration
– Working color space
– Screen background used for file viewing
– Viewing environment
– Physical image size on the screen
– Sharpening
– Image orientation
– Resolution and file size
3. Image delivery
– File format
– Image layers for documentation of image processing conducted
– ICC color management
– Delivery media
– Guide prints and proofs
4. Image archiving
– Archiving protocol
– Proper handling and storage of guide prints
– Metadata
– Image naming
Hidden-Target Paintings
Gamblin Artist’s Oil Colors
Comparison of Corrected Paintings
CS1 CS2 CS3 CS4
Universal Test Target
Results from UTT
Cameras—Color Performance
Objective Targets
• Input targets—output targets
Experimentation
• Define quality criteria based on objective
and subjective metrics
• Develop a method to connect objective,
measurable image quality to subjective
image quality as perceived by the observers
• Benchmark current quality
Subjective Targets
Press Sheets
Experimental Methodology
• Emphasis on the perceptual image quality
of printed reproduction and on display
– Objective targets measured as well
• Evaluation performed using a variety of
pictorial “targets”
– Sent to a variety of cultural heritage institutions
for them to put through their imaging processes
Images Printed at RIT’s Print
Applications Laboratory
• Heidelberg Speedmaster sheet-fed press
– ISO 12647
– Visual match to guide prints
– NewPage Sterling
80# Gloss Text
• HP Indigo
Digital Press
Perceptual Testing
• Observers experienced with fine art
reproduction
– Fine art photographers
– Curators
– Art historians
– Conservators
– Librarians
– RIT students & staff
Experiments Conducted
• The Impact Of Lighting On Perceived Quality Of
Fine Art Reproductions
• Evaluating CATs as Predictors of Observer
Adjustments in Softcopy Fine Art Reproduction
• Comparing Hardcopy and Softcopy Results In the
Study of the Impact of Workflow on Perceived
Reproduction Quality of Fine Art Image
• Evaluating Digital Printing for Fine Art
Reproduction
• Fine Art Reproduction Workflows for the Web
Environment
Objective Targets
Experimental Methodology
• 17 institutions participated
• 30 hard-copy renditions of each image
were included
– 19 prints made ‘to the numbers’
– 11 visual matches made to guide prints
• All prints made on NewPage Sterling Ultra
80# Matte Text paper
• 16 soft-copy renditions used
• Variety of cameras and color spaces
Psychophysical Testing
• Hard copy experiments followed rank order
protocol
– Observers ordered the prints from best to worst
reproduction or representation of the original
– Most to least preferred rendition
• Soft copy experiments followed paired
comparison protocol
– Best reproduction or representation of the original
– Most preferred rendition
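A minimal sketch of how paired-comparison votes can be turned into an interval scale of Z-scores. It assumes Thurstone's Case V model, a common choice for this kind of data; the slides do not state which analysis was used. The vote counts, the number of observers, and the function name `case_v_scale` are hypothetical.

```python
from statistics import NormalDist


def case_v_scale(wins, n_observers):
    """Interval scale values from a paired-comparison win matrix.

    wins[i][j] is the number of observers who preferred rendition i
    over rendition j. At least two renditions are required.
    """
    k = len(wins)
    inverse_cdf = NormalDist().inv_cdf
    scores = []
    for i in range(k):
        z_sum = 0.0
        for j in range(k):
            if i == j:
                continue
            p = wins[i][j] / n_observers
            # Clamp unanimous votes so the inverse CDF stays finite
            p = min(max(p, 0.01), 0.99)
            z_sum += inverse_cdf(p)
        # Average z-score against the other renditions
        scores.append(z_sum / (k - 1))
    return scores


# Hypothetical votes from 20 observers on three renditions A, B, C
wins = [
    [0, 15, 18],  # A preferred over B 15 times, over C 18 times
    [5, 0, 12],
    [2, 8, 0],
]
scores = case_v_scale(wins, n_observers=20)
print(scores)
```

A higher score means the rendition was preferred more often; because each pair of proportions is complementary, the scores sum to roughly zero.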
Soft-copy set up
Hard-copy set up
Experimental Setups
Key Findings
• Results with and without the original present are more
consistent for hard-copy prints than soft-copy images
• Hard-copy results are more consistent with soft-copy
results when the original is present
– Original is typically not present when users are viewing fine art
reproductions
• Observers did not like lower contrast images when they
were electronically displayed
• Of interest to identify workflows that provide both
acceptable representations of the originals as well as
pleasing images on screen and in print
Color Difference (ΔEab) at Capture
Lightness Difference (ΔL) versus Perceptual Quality Rating (Z-score)
[Scatter plot: mean ΔL (axis 0 to 12) against mean Z-scores (axis -1 to 1); legend: Mean Delta L, Form 5, Form 11, Form 13, Form 19, linear fit; R² = 0.8144]
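For reference, a short sketch of the two objective quantities named in these chart titles, assuming ΔEab means the CIE 1976 color difference (Euclidean distance in CIELAB) and ΔL the difference in L* alone. The patch values are hypothetical, not data from the study.

```python
import math


def delta_e_ab(lab1, lab2):
    """CIE 1976 color difference between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)


def delta_l(lab_original, lab_reproduction):
    """Signed lightness difference; positive means the reproduction is lighter."""
    return lab_reproduction[0] - lab_original[0]


# Hypothetical target patch: reference value versus value measured in a capture
reference = (50.0, 10.0, -20.0)
captured = (53.0, 14.0, -20.0)
print("Delta E*ab:", delta_e_ab(reference, captured))
print("Delta L*:", delta_l(reference, captured))
```

Averaging the absolute ΔL over the patches of a target gives a single tone-scale figure per workflow, which is the kind of value that could be plotted against a perceptual score.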
Color Management Check
Web Experiment User Interface
Key Areas of Interest
Key Areas for Photographers versus
Other Occupations
Key Findings
• Testing conditions had a limited impact
on the preference judgments for these
images
• Ranking results for the experiments
conducted in the lab without the
original and via the web were highly
correlated, indicating that, when the
original is not included, a web-based
test may be a reasonable approach
Key Experimental Findings
• Camera make, lights, file format did not influence our
results
– Everybody is using equipment uniformly capable of doing this job
• Lighting conditions may have a strong impact on image
appearance
– Proofing protocols will have to be revisited
• The use of a target to ensure proper capture setup is
recommended
• Main goal: get the tone scale right at capture
• Following standardized workflows, ISO printing standards
and viewing standards reduces need for manual post
processing
Key Findings−Interviews
• Define imaging goals and talk to your users
– This will help set expectations
• Acceptability varies for the different stakeholders –
this needs to be clearly communicated
• Document workflows in detail
– No undocumented processing should be performed
along the image interchange cycle
– The more often a file is touched, the worse the results
• Close the communication loop in the image
interchange cycle
Art Image Interchange Cycle
Photographers
Paper Manufacturers
Conservators
Publication Staff
Curators
Printers
Visitors
Imaging Scientists
Standards Experts
Graphic Designers
Managers of
Imaging Studios
Art Historians
DAM staff
Equipment
Manufacturers
Digital Imaging
Specialists
Licensing Staff
Exhibitions
Editorial
Pre-press
Merchandising
Librarians
Metatorial
Future Work
• Standardization even more important with
globalized workflows
– ISO JWG 26: combining existing guidelines and
standards for quality evaluation of imaging systems
– Training for implementation of standards needed
– Define stepping stones to get to a standardized
workflow
• Bring all threads of imaging in an institution
under “one roof”
Phase 2 ― Appraisal
• Deciding what is essential
– Characteristics that give object meaning,
integrity, authenticity
• Encode what is essential
– Metadata production
• Validating objects
– Are they what they seem to be?
Checksums
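A minimal sketch of checksum-based validation: record a digest when the image is created, then recompute and compare it later. SHA-256 is an assumption here, since the slides do not name an algorithm, and the file is a throwaway stand-in for an image master.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_digest):
    """True if the file still matches the digest recorded earlier."""
    return sha256_of(path) == recorded_digest


# Demonstration on a temporary file standing in for an image master
with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "master.tif"
    master.write_bytes(b"stand-in image data")
    recorded = sha256_of(master)          # store this with the metadata
    ok_before = is_unchanged(master, recorded)
    master.write_bytes(b"stand-in image data, altered")
    ok_after = is_unchanged(master, recorded)

print(ok_before, ok_after)
```

Reading in chunks keeps memory use flat for large masters. A mismatch says only that the bytes changed, not why, so the recorded digest needs to be kept alongside the other metadata for the object.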
Metadata
• Descriptive
– You cannot preserve what you do not know you have
– You cannot sustain use for items that cannot be
identified
• Structural
– Encoding of relationships facilitates management, use
• Administrative
– Ownership, rights of access, provenance
• Technical/Preservation
– Format attributes
– Documentation of significant properties and
preservation intentions to inform preservation
strategies
www.loc.gov/standards/premis/v2/premis-2-0.pdf
Metadata Containers
• Directory and file names
• File headers
• XML
– XMP (e.g., within JP2), EXIF, NISO MIX,
METS
• Database tables
• Printed reports
NISO
Phase 3 ― Deposit
• Choosing a repository: build or buy?
• Packaging data for deposit
• Validating data and objects
Phase 4 ― Archiving and Preservation
• Repositories
• Standards and guidelines
Storage Options
• Interim storage
– Digital asset management system
– Store data off-line on magnetic or
optical media
• Repository storage
– Build a repository
– Pay annual fee to use an external
repository
Interim Storage─The Bare Essentials
• Assign checksums to images early in the production
process
• Document rationale for creating images
– At the very least, include a readme file on the storage media;
a database is best
• Avoid use of “meaningful” filenames
• Use new media
– Follow advice/recommendations of IT9.21 and IT9.23
standards
• Create duplicates and store duplicates in separate
locations
• Create explicit “links” between catalog records and
images
• Assign preservation responsibility to appropriate entity
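One way to combine two of the points above, sketched under assumptions: give each image an opaque filename that carries no meaning, and record the explicit link to its catalog record in a separate table. The record identifiers, the `.tif` extension, and the column names are hypothetical.

```python
import csv
import io
import uuid


def assign_opaque_names(catalog_ids, extension=".tif"):
    """Map each catalog record ID to a filename that carries no meaning."""
    return {record_id: uuid.uuid4().hex + extension for record_id in catalog_ids}


def link_table_csv(links):
    """Serialize the record-to-file links as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["catalog_record_id", "image_filename"])
    for record_id, filename in sorted(links.items()):
        writer.writerow([record_id, filename])
    return buffer.getvalue()


# Hypothetical catalog record identifiers
links = assign_opaque_names(["rec-0001", "rec-0002"])
table = link_table_csv(links)
print(table)
```

Because the filename says nothing about the object, a later change to a title or call number never forces a rename; the link table (or, better, a database) carries the relationship and should be duplicated along with the images.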
Preservation Repository
• Long term storage strategy for
masters
– Preservation responsibilities delegated to
service provider
– OAIS
– Accountable, auditable and fiscally
sustainable
Managing Risk
• Security and access control
– Preventing unauthorized use, tampering or
theft
– Protecting rights holders
• Data obsolescence
– Media incompatible with players
– Formats
• Functional obsolescence
– Formats incompatible with user needs
• Fiscal obsolescence
Phase 5 ― Discovery and Delivery
• Digital library infrastructure
– Catalog or other database for descriptive
information
– Persistent naming
– Access management
• Rendering
– Hardware, web browsers
– Emulator
Pricing Components
• Various pricing models
– Subscription (JSTOR)
– Storage (Harvard Digital Repository
Service)
– Accession, subscription and storage
(OCLC Digital Archive)
Managing Costs
• Minimize number of conservation and
reformatting interventions over entire
life cycle
• Manage the storage environment
– “Geography is preservation destiny”
• Negotiate costs of outsourced services,
e.g., through consortia
Summary
• Digitization is not preservation
• Storage is not synonymous with digital
preservation, and storage is neither free nor cheap
• Stewardship and digital preservation require active
oversight of content, technologies, and user
expectations
• Preservation planning depends upon
extensive, well-managed metadata
• Distributed, but shared expertise centers and tools
will be essential to managing costs
Acknowledgements
The Andrew W. Mellon Foundation
Participating Institutions
Susan Farnand, RIT
Observers
Steven Chapman, Harvard University