PDF/A as a Better Format Than TIFF for OAIS Submission Packages
1. Beyond TIFF and JPEG2000: PDF/A as an OAIS submission
information package container
Presented by Massoud Mortazavi
MS student in Information Science
SBU University
Instructor: Mrs. Pakdaman
Han, Y. (2015). Beyond TIFF and JPEG2000: PDF/A as an OAIS submission information package
container. Library Hi Tech, 409-423.
http://dx.doi.org/10.1108/LHT-06-2015-0068
2. Abstract
Purpose
To introduce PDF/A as a replacement for TIFF as the preferred file format for the digitization of
textual documents
Methodology
The author reviews current digitization guidelines and the OAIS model, and provides an
overview of the development of PDF and PDF/A as international standards
3. Abstract
Findings
TIFF has been the preferred master file format for digitization
PDF/A has been the preferred standard for encoding born-digital documents
PDF/A can be used as an OAIS SIP container
Background
Libraries have been digitizing for more than 20 years
The Digital Library Federation (DLF) has published several critical digitization guidelines
4. Standardization of PDF as PDF/A Format
PDF/A has been published as an ISO standard in three parts, starting in 2005:
PDF/A-1 (based on PDF 1.4): ISO 19005-1:2005
PDF/A-2 (based on ISO 32000-1, i.e. PDF 1.7): ISO 19005-2:2011
PDF/A-3 (use of ISO 32000-1 with support for embedded files): ISO 19005-3:2012
6. PDF Versions
PDF 1.4:
Version 1.4 was the basis for the first versions of the ISO standards PDF/X and
PDF/A
PDF 1.7:
The original version 1.7 of the PDF format was released in November 2006 and is
associated with Acrobat and Adobe Reader 8.0. Version 1.7 was published as ISO
32000-1 in July 2008
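As a rough illustration of how these version numbers appear in practice, the sketch below reads the version declared in a PDF file's header comment (for example `%PDF-1.4`). The function name `pdf_header_version` and the sample bytes are hypothetical. The sketch assumes the header is authoritative, although a document's catalog may declare a later version than the header does.

```python
import re
from typing import Optional

# Matches the header comment that opens a PDF file, e.g. b"%PDF-1.4".
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")

def pdf_header_version(data: bytes) -> Optional[str]:
    """Return the version declared in the file header, or None if absent.

    Hypothetical helper: it inspects only the header comment and does not
    look at any version override stored inside the document itself.
    """
    # The header is expected near the very start of the file.
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    major, minor = match.groups()
    return f"{major.decode()}.{minor.decode()}"

# Made-up sample bytes standing in for the start of a real file.
sample = b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"
print(pdf_header_version(sample))
```

A header version alone does not show PDF/A conformance. It only indicates which PDF version the file claims to follow.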
7. PDF/A as an OAIS SIP container
The key requirement of PDF/A is that the file is self-described and self-contained, so that
it can be reproduced in exactly the same way by different software on various
platforms.
All of the information necessary for displaying the document is embedded in the
PDF/A file:
text, raster images, vector graphics, fonts and color profiles
8. PDF/A as an OAIS SIP container
(1) Tagged PDF: embed structural metadata via predefined PDF tags or custom
tags;
(2) Self-contained: embed the required color profiles, fonts and other related information;
and
(3) Self-described using Extensible Metadata Platform (XMP) metadata: PDF/A can
encode all the information required by an OAIS SIP through the standard and XMP.
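To make the "self-described" point more concrete, the following minimal sketch reads a few properties from an XMP packet using only the Python standard library. The packet in `SAMPLE_XMP` is a made-up example, not taken from the paper, and `read_xmp_summary` is a hypothetical helper. The sketch assumes the properties are serialized as XML elements. XMP also allows them as attributes on `rdf:Description`, which this code does not handle.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF, Dublin Core and the PDF/A identification schema.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pdfaid": "http://www.aiim.org/pdfa/ns/id/",
}

# Hypothetical XMP packet of the kind embedded in a PDF/A file.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
   <pdfaid:part>3</pdfaid:part>
   <pdfaid:conformance>B</pdfaid:conformance>
   <dc:title>
    <rdf:Alt><rdf:li xml:lang="x-default">Sample digitized report</rdf:li></rdf:Alt>
   </dc:title>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

def _text(root, path):
    """Return the text of the first element matching path, or None."""
    node = root.find(path, NS)
    return node.text if node is not None else None

def read_xmp_summary(xmp: str) -> dict:
    """Extract the PDF/A part, conformance level and title from an XMP packet."""
    root = ET.fromstring(xmp)
    return {
        "pdfa_part": _text(root, ".//pdfaid:part"),
        "conformance": _text(root, ".//pdfaid:conformance"),
        "title": _text(root, ".//dc:title/rdf:Alt/rdf:li"),
    }

print(read_xmp_summary(SAMPLE_XMP))
```

In a real workflow the packet would first have to be extracted from the PDF's metadata stream, a step this sketch leaves out.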
9. TIFF as a Good Format
For the past 20 years, TIFF 6.0 has been the preferred master file format for
digitization, owing to factors such as the availability of the technical specification
and an easy-to-understand file structure.
TIFF is very simple
Easy to repair
Easy to migrate
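The simplicity of the TIFF structure can be seen in its 8-byte header: a byte-order mark (`II` or `MM`), the number 42, and the offset of the first image file directory (IFD). The sketch below decodes that header. The function name `parse_tiff_header` and the sample bytes are hypothetical, and the sketch covers classic TIFF only, not BigTIFF.

```python
import struct

def parse_tiff_header(data: bytes) -> dict:
    """Decode the 8-byte header of a classic TIFF file (hypothetical helper)."""
    if len(data) < 8:
        raise ValueError("a TIFF header is 8 bytes long")
    order = data[:2]
    if order == b"II":      # Intel order: little-endian
        endian = "<"
    elif order == b"MM":    # Motorola order: big-endian
        endian = ">"
    else:
        raise ValueError("missing TIFF byte-order mark")
    # Bytes 2-3 hold the magic number 42; bytes 4-7 hold the first IFD offset.
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("magic number is not 42")
    return {
        "byte_order": "little" if endian == "<" else "big",
        "first_ifd_offset": ifd_offset,
    }

# Made-up little-endian header whose first IFD starts right after the header.
print(parse_tiff_header(b"II*\x00\x08\x00\x00\x00"))
```

Because everything else in the file is reached by following offsets from this header, a damaged TIFF can often be inspected and repaired with very simple tools.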
10. The Problems of TIFF
It cannot include layers or JPEG (this is not true)
TIFF 6.0 is described as an open standard (but it must be used under a license, so it is
not actually an open standard)
Large file size
Inflexible for web and mobile delivery
Indexing is difficult
OCR text, XMP and ALTO XML are not supported
METS is not supported (?)
TIFF tags are difficult to work with
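The file-size point can be checked with simple arithmetic: an uncompressed raster image needs roughly width × height × bits per pixel / 8 bytes. The sketch below applies this to a hypothetical 8.5 × 11 inch page scanned at 400 dpi. The page size and resolution are assumptions chosen for illustration, not figures from the paper, and the result ignores headers and tags.

```python
def uncompressed_size_bytes(width_in: float, height_in: float,
                            dpi: int, bits_per_pixel: int) -> int:
    """Approximate size of the uncompressed pixel data for one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

# Hypothetical letter-size page at 400 dpi.
color = uncompressed_size_bytes(8.5, 11, 400, 24)   # 24-bit color
bitonal = uncompressed_size_bytes(8.5, 11, 400, 1)  # 1-bit black and white

print(f"24-bit color: {color:,} bytes")   # 44,880,000 bytes
print(f"1-bit bitonal: {bitonal:,} bytes")  # 1,870,000 bytes
```

At about 45 MB per color page, a collection of a few thousand pages stored as uncompressed masters quickly reaches hundreds of gigabytes.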
11. What about PDF?
Open international standard
Self-contained and self-described
Flexible
Space-saving
Better metadata support with XMP
Other files or data: PDF/A-3 can embed any file or data
13. Summary
PDF/A has been an international standard since 2005, and PDF since 2008
PDF/A has been widely accepted as the preferred master file format for born-digital
documents, but it has not been recommended for digitization
Each PDF/A part (1, 2 and 3) can be used for some digitization work
The author shows how PDF/A is a better file format than the currently preferred TIFF
and JPEG2000