Beyond TIFF and JPEG2000: PDF/A as an OAIS submission
information package container
Presented by Massoud Mortazavi
MS student in Information Science
SBU University
Instructor: Mrs. Pakdaman
Han, Y. (2015). Beyond TIFF and JPEG2000: PDF/A as an OAIS submission information package
container. Library Hi Tech, 33(3), 409-423.
http://dx.doi.org/10.1108/LHT-06-2015-0068
Abstract
• Purpose
To introduce PDF/A as a replacement for TIFF as the preferred file format for
digitizing textual documents.
• Methodology
The paper reviews current digitization guidelines and the OAIS model, and provides
an overview of the development of PDF and PDF/A as international standards.
2
Abstract
• Findings
TIFF has been the preferred master file format for digitization.
PDF/A has been the preferred standard for encoding born-digital documents.
PDF/A can be used as an OAIS SIP container.
• Background
Libraries have been digitizing materials for more than 20 years.
The Digital Library Federation (DLF) has published several critical digitization guidelines.
3
Standardization of PDF as PDF/A
Standardization of PDF as the PDF/A format started in 2005:
PDF/A-1 (based on PDF 1.4): ISO 19005-1:2005
PDF/A-2 (based on ISO 32000-1, i.e. PDF 1.7): ISO 19005-2:2011
PDF/A-3 (based on ISO 32000-1, i.e. PDF 1.7, with support for embedded
files of any format): ISO 19005-3:2012
4
5
PDF Versions
PDF 1.4:
Version 1.4 was the basis for PDF/A-1 and for early versions of the PDF/X
standards.
PDF 1.7:
The original version 1.7 of the PDF format was released in November 2006
alongside Acrobat and Adobe Reader 8.0. Version 1.7 was published as ISO
32000-1 in July 2008.
6
PDF/A as an OAIS SIP container
• The key requirement of PDF/A is that a file be self-described and self-contained, so
that it renders exactly the same way in different software on different
platforms.
• All of the information necessary for displaying the document is embedded in the
PDF/A file.
• This includes text, raster images, vector graphics, fonts and color profiles.
7
PDF/A as an OAIS SIP container
(1) Tagged PDF: embed structural metadata via predefined PDF tags, or create your
own tags;
(2) Self-contained: embed the required color profiles, fonts and other related information;
and
(3) Self-described using Extensible Metadata Platform (XMP) metadata: PDF/A can
encode all the information required for an OAIS SIP through the standard itself and XMP.
8
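Point (3) can be made concrete with a minimal sketch of the XMP metadata that identifies a file as PDF/A. It uses only the Python standard library and assumes the PDF/A identification schema (`pdfaid:part`, `pdfaid:conformance`) plus a Dublin Core title. The function name `build_xmp` and the sample title are illustrative assumptions, not part of the paper, and the `<?xpacket ...?>` wrapper that surrounds a real XMP packet is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by XMP, Dublin Core and the PDF/A identification schema.
X_NS = "adobe:ns:meta/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Conformance levels defined for each PDF/A part.
LEVELS = {1: {"A", "B"}, 2: {"A", "B", "U"}, 3: {"A", "B", "U"}}


def build_xmp(title: str, part: int, conformance: str) -> str:
    """Build a bare XMP document (hypothetical helper) declaring PDF/A part and level."""
    if conformance not in LEVELS.get(part, set()):
        raise ValueError(f"PDF/A-{part} has no conformance level {conformance!r}")

    # Register readable prefixes so the output uses x:, rdf:, dc:, pdfaid:.
    for prefix, uri in (("x", X_NS), ("rdf", RDF_NS), ("dc", DC_NS), ("pdfaid", PDFAID_NS)):
        ET.register_namespace(prefix, uri)

    xmpmeta = ET.Element(f"{{{X_NS}}}xmpmeta")
    rdf = ET.SubElement(xmpmeta, f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""})

    # PDF/A identification: which part of ISO 19005 and which conformance level.
    ET.SubElement(desc, f"{{{PDFAID_NS}}}part").text = str(part)
    ET.SubElement(desc, f"{{{PDFAID_NS}}}conformance").text = conformance

    # dc:title is a language alternative: rdf:Alt holding rdf:li with xml:lang.
    title_el = ET.SubElement(desc, f"{{{DC_NS}}}title")
    alt = ET.SubElement(title_el, f"{{{RDF_NS}}}Alt")
    li = ET.SubElement(alt, f"{{{RDF_NS}}}li", {f"{{{XML_NS}}}lang": "x-default"})
    li.text = title

    return ET.tostring(xmpmeta, encoding="unicode")


if __name__ == "__main__":
    print(build_xmp("Scanned annual report", 3, "B"))
```

Descriptive or preservation metadata from a SIP could be added to the same `rdf:Description` in the same way, which is the sense in which the file describes itself.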
TIFF as a Good Format
• For the past 20 years, TIFF 6.0 has been the preferred master file format for
digitization, owing to factors such as the availability of its technical specification
and its easy-to-understand file structure.
• TIFF is very simple
• Easy to repair
• Easy to migrate
9
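To illustrate the "easy-to-understand file structure" claim, here is a minimal sketch that reads the 8-byte header defined in the TIFF 6.0 specification: two bytes of byte order (`II` or `MM`), the number 42, and the offset of the first image file directory (IFD). The function name `parse_tiff_header` is an illustrative assumption, and the sketch reads nothing beyond the header.

```python
import struct


def parse_tiff_header(data: bytes) -> tuple[str, int]:
    """Return (byte_order, first_ifd_offset) from the 8-byte TIFF header."""
    if len(data) < 8:
        raise ValueError("a TIFF header is 8 bytes long")

    order = data[:2]
    if order == b"II":
        endian = "<"  # little-endian ("Intel")
    elif order == b"MM":
        endian = ">"  # big-endian ("Motorola")
    else:
        raise ValueError("not a TIFF file: unknown byte-order mark")

    # A 16-bit magic number (42) followed by a 32-bit offset to the first IFD.
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file: magic number is not 42")

    return ("little" if endian == "<" else "big", ifd_offset)


if __name__ == "__main__":
    sample = b"II" + struct.pack("<HI", 42, 8)
    print(parse_tiff_header(sample))
```

Being able to check the header in a dozen lines is part of why a damaged TIFF can often be inspected and repaired by hand.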
The Problems of TIFF
• It cannot include layers or JPEG (not true)
• TIFF 6.0 is said to be an open standard (but its use is subject to a license, so it is
not actually an open standard)
• Large file size
• Inflexible for web and mobile delivery
• Indexing is difficult
• OCR text, XMP and ALTO XML are not supported
• METS is not supported (?)
• TIFF tags are difficult to work with
10
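The file-size problem is easy to quantify for uncompressed master files. The sketch below estimates the pixel data of one scanned page; the page size (8.5 x 11 inches), resolution (400 dpi) and bit depth (24-bit color) are assumed example values rather than figures from the paper, and the small overhead of the header and tags is ignored.

```python
def uncompressed_scan_bytes(width_in: float, height_in: float,
                            dpi: int, bits_per_pixel: int) -> int:
    """Approximate size in bytes of the uncompressed pixel data of one scan."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    page = uncompressed_scan_bytes(8.5, 11, 400, 24)
    print(f"one page: {page:,} bytes")          # 3400 x 4400 pixels x 3 bytes
    print(f"300 pages: {page * 300:,} bytes")
```

With these assumed values a single page is 44,880,000 bytes, so a 300-page volume comes to roughly 13.5 GB before any compression.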
What about PDF?
• Open international standard
• Self-contained and self-described
• Flexible
• Space saving
• Better metadata support with XMP
• Other files or data: PDF/A-3 can embed any file or data
11
ALTO XML
12
Summary
• PDF/A has been an international standard since 2005, and PDF itself since 2008
• PDF/A has been widely accepted as the preferred master file format for born-digital
documents, but it has not been recommended for digitization
• Each PDF/A version (1, 2 and 3) can be used for some digitization work
• The author shows how PDF/A is a better file format than the currently preferred TIFF
and JPEG2000
13
