Nevada Digital Newspaper Project
Dana Bullinger (Project Coordinator) and Melissa Stoner (Project Technician)
PHASE ONE
Title Selection
● Advisory Board selects qualified titles
○ Research Value
○ Geographic Representation
○ Temporal Coverage
○ Diversity
NDNP Title Guidelines
●Complete (or majority of) title run should be available
on microfilm without restrictions
●Technical factors to consider:
○ Quality of original text and microfilm capture
○ Reduction ratio (the lower the better; below 20x)
○ Duplicated camera master negative microfilm should have resolution
test patterns readable at 5.0 or higher
○ Variations of no more than 0.2 within images and between exposures
○ Confidence level through OCR testing of sample page images
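The numeric thresholds above lend themselves to a quick screening check. The following is a minimal sketch, not part of the NDNP tooling: the `FilmSample` record and its field names are hypothetical, and treating the 0.2 limit as a spread between density readings is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FilmSample:
    """Measurements for one candidate reel (hypothetical record layout)."""
    reduction_ratio: float     # e.g. 18.0 means 18x
    resolution_pattern: float  # highest readable resolution test pattern
    density_min: float         # lowest density reading taken on the reel
    density_max: float         # highest density reading taken on the reel


def screening_problems(sample: FilmSample) -> list[str]:
    """Return the guideline thresholds this reel fails (empty list = passes)."""
    problems = []
    if sample.reduction_ratio >= 20:
        problems.append("reduction ratio is not below 20x")
    if sample.resolution_pattern < 5.0:
        problems.append("resolution test pattern below 5.0")
    # Round to two decimals so float noise does not trip the 0.2 limit.
    if round(sample.density_max - sample.density_min, 2) > 0.2:
        problems.append("density varies by more than 0.2")
    return problems
```

A reel that returns an empty list still needs the OCR confidence test on sample pages, which this sketch does not cover.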
Deliverables
For Each Title
•Up-to-date MARC record from the
CONSER OCLC database
•Additional title-level metadata (Reel-Level
Metadata spreadsheet example)
•Newspaper History Essay - 500 words per
title
For each issue
•Structural metadata for issues digitized and
organized by date (Page-Level Metadata
spreadsheet example)
Deliverables
For each newspaper page
- Page image in two formats
- Grayscale, scanned between 300-
400 dpi, uncompressed TIFF 6.0
image file
- Same image, compressed as
JPEG2000 (.JP2)
- OCR text using the ALTO schema
(1 file per page)
- PDF image with Hidden Text
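The four per-page deliverables can be checked for completeness before a batch leaves the building. This is a rough sketch under assumptions: the file extensions and the idea that all four files for a page share one file stem are illustrative, not taken from the NDNP specification.

```python
from pathlib import Path

# Assumed extension for each per-page deliverable; real naming rules may differ.
REQUIRED_SUFFIXES = {
    ".tif": "TIFF master",
    ".jp2": "JPEG2000",
    ".xml": "ALTO OCR",
    ".pdf": "PDF",
}


def missing_deliverables(filenames):
    """Map each page stem to the deliverable types it lacks."""
    found = {}
    for name in filenames:
        path = Path(name)
        found.setdefault(path.stem, set()).add(path.suffix.lower())
    return {
        stem: sorted(
            label
            for suffix, label in REQUIRED_SUFFIXES.items()
            if suffix not in suffixes
        )
        for stem, suffixes in found.items()
        if not set(REQUIRED_SUFFIXES) <= suffixes
    }
```

Pages with all four files are left out of the result, so an empty dictionary means nothing is missing.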
PHASE TWO
Selected Titles
● Research Library of
Congress Control Numbers
(LCCNs) and OCLC numbers
for all titles
● Accurate LCCNs critical for
data management
● Fill in spreadsheet
● Send to LC for approval
Before Duplication Begins...
●Set up purchase order with selected
digitization vendor (iArchives)
●Research and order microfilm reader
●Send work plan to NEH
●Order 10 1-TB Hard Drives for our
deliverables
Microfilm Reader and Software
•14MP Image Sensor
•Light Source
•File Output
•Lens with 7x to 105x
magnification
Sample Batch
● Sample batch allows Library of Congress to
identify any potential problems and ensures
technical specifications are being implemented
● Tonopah Daily Bonanza (1901-1903)
● Negative and Positive Reels duplicated by
NSLA and sent to UNLV
● Apply LC-provided barcodes on Negative Reel
boxes
○ Barcode connects digital content to physical
reel deposited at LC
MasterFile
●Document everything in the MasterFile and Reel-Level
Spreadsheet
○ Title, Year, LCCN, Barcode/Reel Number, Unique name for iArchives,
metadata received from NSLA
Collation: Reel-Level
UNLV
● Unique Name
● LCCN
● Reel-Number
● Location of Publication
● Start/End date
● Digital Responsible Institution
NSLA
● Title
● Source Repository
● Density Readings
● Reduction Ratio
● Average Density
Collation: Page-Level
● Use template
● One page-level spreadsheet = one reel
● Page count
● Anomalies
- Missing issues or pages
- Duplicate issues or pages
- Mutilated pages
- Other abnormalities (e.g. pages out of
order, incorrect dates)
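Date-level anomalies such as missing or duplicate issues can be flagged automatically once the issue dates from a page-level spreadsheet are in a list. This is a sketch only: the function name and the `skip_weekdays` parameter (for titles that did not publish on certain days) are hypothetical, and mutilated or out-of-order pages still need a person at the reader.

```python
from collections import Counter
from datetime import date, timedelta


def collation_anomalies(issue_dates, skip_weekdays=()):
    """Report duplicate dates and gaps between the first and last issue.

    skip_weekdays holds weekday numbers (Monday = 0) on which no issue
    is expected, so those days are not reported as missing.
    """
    counts = Counter(issue_dates)
    duplicates = sorted(d for d, n in counts.items() if n > 1)
    missing = []
    if counts:
        day, last = min(counts), max(counts)
        while day <= last:
            if day not in counts and day.weekday() not in skip_weekdays:
                missing.append(day)
            day += timedelta(days=1)
    return {"duplicates": duplicates, "missing": missing}
```

Each flagged date is a prompt to go back to the reel; a gap may be a genuinely unpublished issue rather than a filming error.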
Quality Review: Before Delivery to Vendor
● Re-visit collation sheet and reel
metadata line-by-line
● Confirm for accuracy
● Check delivered page count against
● Check all notation for standardization
and clarity
● Metadata properly formatted
iArchives
● iArchives Portal
○ Upload Reel-Level and Page-Level
metadata as .CSV files
● Ship Negative reels and blank hard
drive to be digitized
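Exporting the reel-level sheet to CSV is a natural place to catch blank cells before upload. The sketch below assumes a column layout: the slides do not give the header names the iArchives portal expects, so `REEL_COLUMNS` is a placeholder built from the fields mentioned in this deck.

```python
import csv
import io

# Placeholder header names; the portal's real column names are not given
# in the slides.
REEL_COLUMNS = ["unique_name", "title", "lccn", "reel_number", "start_date", "end_date"]


def reel_rows_to_csv(rows):
    """Serialize reel-level records to CSV text, rejecting incomplete rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REEL_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        blank = [c for c in REEL_COLUMNS if not str(row.get(c, "")).strip()]
        if blank:
            raise ValueError(f"reel row is missing: {', '.join(blank)}")
        writer.writerow({c: row[c] for c in REEL_COLUMNS})
    return buffer.getvalue()
```

Failing loudly on a blank field is a design choice here: it is cheaper to fix a spreadsheet cell than to have a reel digitized under the wrong identifier.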
Scanning Specifications
● Scan from clean second-
generation duplicate silver
negative microfilm (to be
deposited at the Library of
Congress at the end of the award
period)
● Capture specifications are 8-bit
grayscale, between 300 and 400
dpi
● Target film strip should be
scanned at the start of each
session
● Provide the master page images,
delivered to LC, as uncompressed
images in TIFF 6.0 format
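The capture specifications above can be expressed as a check on values read from a TIFF header. This is a minimal sketch: the function and its parameters are illustrative, and reading the tags from an actual file is left to whatever image library is on hand. The one outside fact it relies on is that TIFF 6.0 uses Compression tag value 1 for uncompressed data.

```python
def capture_problems(bits_per_sample, samples_per_pixel, dpi, compression):
    """Return the scanning specifications a master image fails to meet."""
    problems = []
    # 8-bit grayscale means a single 8-bit sample per pixel.
    if bits_per_sample != 8 or samples_per_pixel != 1:
        problems.append("not 8-bit grayscale")
    if not 300 <= dpi <= 400:
        problems.append("resolution outside 300-400 dpi")
    # TIFF 6.0 Compression tag value 1 means no compression.
    if compression != 1:
        problems.append("TIFF is compressed")
    return problems
```

For example, `capture_problems(8, 1, 400, 1)` returns an empty list, while a 600 dpi colour scan with compression would be reported on all three counts.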
PHASE THREE
Back to UNLV
●Receive hard
drive
●Batch Structure
Quality Review
- Quality Review process ensures that NDNP Specifications are met
by checking for image quality, irregularities, and correct
bibliographic software
- Digital Viewer and Validator
(DVV)
- Allows awardees and
vendors to view data and
validate technical aspects of
files
- Verification checks digital
signatures of all files in a batch
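The slides do not say how the DVV computes or stores its digital signatures, so the following is a generic illustration of the idea rather than a description of the DVV: record a checksum for every file, then recompute later to detect files that changed or went missing. The choice of SHA-256 and the manifest layout (relative path mapped to hex digest) are assumptions.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256"):
    """Hash a file in 1 MiB chunks so large TIFFs do not fill memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(batch_dir, manifest):
    """Return relative paths that are missing or differ from the manifest."""
    root = Path(batch_dir)
    bad = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or file_digest(target) != expected:
            bad.append(relative)
    return sorted(bad)
```

Running a check like this both on receipt and before shipping would catch damage introduced while the drive was in transit or in storage.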
Quality Review
● Verify Batch
● Double check dates using Calendar View
in DVV, cross reference with Reel-Level
and Page-Level data
● View thumbnails
● Check OCR (10% of pages)
● Verify Batch with DVV for a second time
● Email Tonijala Penn (LC Liaison) and Deb
Thomas (Project Coordinator for NDNP)
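Picking the 10% of pages for the OCR check at random avoids always reviewing the same part of a reel. This is a sketch under assumptions: the function name is hypothetical, and rounding the sample size up (with at least one page) is a choice made here, not something the slides specify.

```python
import math
import random


def ocr_sample(page_ids, fraction=0.10, seed=None):
    """Pick a random subset of pages for manual OCR review."""
    pages = list(page_ids)
    if not pages:
        return []
    # Round before ceil so float noise cannot add an extra page.
    size = max(1, math.ceil(round(len(pages) * fraction, 6)))
    return sorted(random.Random(seed).sample(pages, size))
```

Passing a fixed `seed` makes the selection repeatable, which helps if a second reviewer needs to look at the same pages.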
Library of Congress
● Ship to LC
○ Hard Drive
○ Shipping Manifest
○ Use fluorescent stickers!
● Receives and processes batch
● 6-8 weeks turnaround time
● If accepted, batch is ingested
into Chronicling America
Totals to date

Digitizing Nevada Newspapers: Workflow


Editor's Notes

  1. M
  2. M
  3. M
  4. M In addition to the master TIFF image file and OCR text using the ALTO schema, the awardee institution will provide a searchable PDF (Portable Document Format) Image with Hidden Text for each page image and a JPEG2000 compressed image file (.JP2). PDFs will provide an image of the original page that can be conveniently printed and downloaded, supporting within-page searching for words, external to the NDNP search system. LC will use the separate OCR output file as the basis for search in its access interface. The PDF Image with Hidden Text can be created at the time of processing by the OCR application.
  5. M
  6. D
  7. D
  8. D
  9. D
  10. D
  11. M
  12. M
  13. M
  14. M
  15. M Newspapers microfilmed two sheets per frame should be split into two separate image files (and assigned appropriate metadata). To improve appearance and OCR accuracy, images that contain text blocks exhibiting more than 3 degrees of skew should be deskewed. Page image files should be cropped to the page edge (not to the text block boundaries), retaining the actual edge and up to ¼ inch beyond. In general, the goal of the NDNP cropping specification is to produce as complete a page image as possible in order to best enable long-term management and access needs into the future.
  16. D
  17. D Verify twice: once when the batch is received, and again before it is shipped to LC.
  18. D
  19. D
  20. D
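The deskew and cropping figures in note 15 reduce to simple arithmetic. As a small sketch with hypothetical helper names: a page needs deskewing when its skew exceeds 3 degrees in either direction, and the ¼ inch allowed beyond the page edge works out to 75 pixels at 300 dpi and 100 pixels at 400 dpi.

```python
def needs_deskew(skew_degrees, threshold=3.0):
    """True when the skew exceeds the threshold in either direction."""
    return abs(skew_degrees) > threshold


def max_crop_margin_px(dpi, margin_inches=0.25):
    """Largest margin, in pixels, to retain beyond the page edge."""
    return int(dpi * margin_inches)
```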