
Digital Reformatting for Preservation 2012


Presentation for CUA Preservation class. Overview of digital reformatting (digitization) and digital preservation/curation issues. 2012 version

Published in: Education, Technology
  • Digital reformatting: sure. Digital preservation: OK, this is a little tricky. Digital reformatting for preservation … or is it digital reformatting and preservation?
  • Reformatting reduces demand for, and wear on, physical books, but it is generally done for access rather than for preservation. If you are reformatting for preservation, the issues are similar, but aim for the highest quality possible to retain maximum information, and follow digital preservation practices for the resulting digital object. (For some disciplines, such as history of the book, a digital object won’t be a satisfactory replacement for the physical volume.)
    To create a good surrogate, understand how the digital item will be used. Text mining? Verifying citations? Reading for fun? Scientific or artistic analysis of plates? The answer helps determine which aspects you concentrate on when digitizing and which standards you follow (resolution, color or bitonal, etc.).
    Boutique vs. mass digitization: economies of scale. If money is not a problem, you can follow all the best practices and highest standards. If it is, pick and choose (cheap, fast, good).
    Scanning is just the first step, and arguably the easiest and best understood. Maintaining the resulting digital object takes work and resources: not just digital preservation/curation, but also enhancement, error correction, and general management.
  • What are we doing when we create a digital image? Explain: divide the picture area into a grid and store color info for each grid space (pixel).
    Resolution = number of pixels used to represent each inch of the original (ppi). True ppi is calculated against book size or character size (a 10x5” book should be 3000x1500 px if scanned at 300 ppi).
    Bit depth = number of bits used to represent each pixel’s color (a byte = 8 bits); more bits enable the capture of more gray shades or color tones. 8 bit is basic; 48 or 64 bit = more info (= bigger files).
    Color = few are still doing bitonal (EXCEPT GOOG); most are doing grayscale or color. Choose based on need. Bitonal is good for text on new documents. Grayscale is good for text documents with damage or spots. Color is good for color. Plus, you can convert color docs to b&w or gray if you want. When doing color, it is important to choose a color space (?) and calibrate regularly with ‘golden thread’ or similar calibration tools. Many still shoot each page with a color reference card.
    File formats = still mostly TIFF. Some save RAW (proprietary to each camera manufacturer = decoding issues later?). JPEG 2000 is an ISO standard, still controversial, but has many advantages.
  • Imaging for books = not just images!
    Metadata, metadata, metadata: regular library stuff, like MARC, DC or MODS; structural metadata; rights metadata.
    OCR: one of life’s little frustrations. The heartbreak of the long ‘s’. Plan for manual OCR correction while we wait for folks to develop better engines.
    Additional file formats: you have your images. Now make PDFs and EPUBs. Making real EPUBs takes more work than just making a document readable on an ebook device.
  • Digital Capture Best Practices
    Resolution: true ppi; 400 ppi or 600! 300 minimum.
    Color fidelity (color space, bit depth).
    Camera, monitor, and target calibration. FADGI: http://www.digitizationguidelines.gov/
    File formats: RAW, TIFF, JPEG 2000.
    File naming conventions: keep related files together in a worst-case scenario (filename is part of IPTC/XMP).
    Embedded metadata, IPTC or XMP: technical is automatic, with space for admin/rights and (basic) descriptive.
    METS and PREMIS create “self-describing” information packets that include the image files.
  • A decision based on quality vs. speed is less important than one based on the type of material being scanned (sheetfeeders, maps) and the condition of the items being scanned.
    Equipment choices: camera on a stand (scanback, hi res $$) || flatbed scanners || overhead scanners || dual-camera models || robots vs. humans!
    Lighting: flash vs. continuous, color temperature.
    Page curvature and depth-of-field issues.
    Outsourcing ??????
    Megapixels = how many pixels are available on the sensor (camera back, scanner) surface for recording info from the original. 8 MP cameras will usually do fine (300 ppi) for quarto-sized books. For folios you need a bigger camera.
  • Be involved in the process!
    Equipment evaluation: lighting (strobes vs. hot lights), things with platens, flatbeds, etc.
    Training scanning staff in book handling and reviewing condition.
    Making sure conservation/repair/rehousing is part of the workflow (at a minimum, indicate which items need treatment).
  • Store it somewhere. Make it findable and usable. Master file, derivatives, other derivative files: based on use cases and user needs. This may vary with material type (rare books, manuscripts) or by discipline (lit & humanities vs. soc sci vs. sci). Manage the lifecycle of your new digital object and ensure you can continue to make it available. Hey, that’s…
  • Use of “preservation” is a little misleading to those in library-land:
      ▪ not about conservation or restoration
      ▪ not about backup procedures or the media on which data is stored
      ▪ no concept of “keeping it for 500 years” or any fixed period of permanence
    You will often hear people use Digital Curation interchangeably; Digital Stewardship is also gaining traction. A continuously evolving process, not a one-time action. More like housework than building a house. No fixed time.
  • Definition: Digital objects can be very complex (a website with media) or simple but dependent on hardware/software (a WordStar document). Even simple digitized books need all pages in order, plus descriptive metadata. How the object is defined must include context so it can be correctly interpreted in future.
    Authenticity: Media and systems on which digital objects are stored have uncertain lifespans. Need to plan for migration, and assure that the entire object is safely migrated and is the same after migration, to ensure authenticity. Digital objects are easy to change, either maliciously or accidentally. Made of 1s and 0s! Bit rot! Oopsies.
    Access: If they can’t find it, what’s the point? Metadata, systems. Need robust metadata to describe the object so it can be found AND used AND understood. Continuing access for as long as necessary means migration (see above).
    Planning: includes creating and maintaining preservation plans, DAMPs.
    Management: putting organizational structure in place to maintain and manage systems and objects.
    OAIS model: a high-level conceptual framework that guides an organization’s implementation of digital preservation practices. Also covers the technical issues, but in a broad way. Preservation planning and institutional commitment.
    OAIS (an ISO standard, but with no way of ‘certifying’ whether orgs/systems are implementing it, or implementing it well): "an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community." Where "The information being maintained is deemed to need Long Term Preservation, even if the OAIS itself is not permanent. Long Term is long enough to be concerned with the impacts of changing technologies, including support for new media and data formats, or with a changing user community". http://www.paradigm.ac.uk/workbook/introduction/oais.html
  • Planning and documenting, including documenting methodology (why we are doing x this way).
    Organizational permanence (will your department be around in 20 years, even if your institution will be…).
    Responsibilities: creation & administration of preservation plans; creation & management of DAMPs; choosing standards; data management.
    Importance of periodic reevaluation of the methodologies and standards that are applied, to make sure they are working and still relevant. Plans are living documents; need to build in review & revision.
    Preservation planning concerns preserving the accessibility and readability of data. Its functions address technical issues such as recommending file format standards, monitoring changes in technology, and evaluating the content of a digital archive. The preservation plan should reflect which current strategies would best preserve access to content. Selecting a suitable solution, with the help of different tools, makes it possible to implement a specific method.
  • Handled by systems, good metadata and carrying out a well-developed plan.
    Systems use checksums for data authenticity, plus other security pieces to ensure no tampering. Need context to know what your ‘authentic’ object is, so metadata is part of authenticity. Systems need built-in periodic authenticity checking.
    Hardware/software is available that supports the OAIS model, including DSpace, Fedora, LOCKSS.
    Make sure files are findable & accessible by having good metadata (again!). This includes standards like PREMIS, but also standard practices like assigning DOIs or URIs, having clear file naming conventions, and careful migration from system to system.
    Access may include joining cooperatives/services. It’s too much for one little org to go it alone! Dataverse, MetaArchive, LOCKSS, Internet Archive, DuraSpace/DuraCloud.
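The periodic authenticity checking mentioned in the note above can be illustrated with a small fixity audit. This is only a sketch, not how DSpace, Fedora, or LOCKSS actually do it: the function names (`sha256_of`, `build_manifest`, `audit`) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def audit(folder: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files now on disk against a previously stored manifest."""
    current = build_manifest(folder)
    return {
        # Listed in the manifest but no longer on disk.
        "missing": sorted(set(manifest) - set(current)),
        # On disk but never recorded in the manifest.
        "unexpected": sorted(set(current) - set(manifest)),
        # Present in both, but the bits have changed.
        "changed": sorted(
            name
            for name in manifest.keys() & current.keys()
            if manifest[name] != current[name]
        ),
    }
```

In use, the manifest would be built once when the master files are ingested and stored alongside them; running `audit` on a schedule, and again after every migration, flags files that have gone missing or changed.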
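The resolution, bit depth, and megapixel arithmetic in the imaging notes above can be worked through in a few lines. This is a rough sketch under stated assumptions: the function names are hypothetical, sizes are for uncompressed pixel data only (no file headers, metadata, or compression), and the page is assumed to fill the sensor frame exactly.

```python
def pixel_dimensions(width_in: float, height_in: float, ppi: int) -> tuple[int, int]:
    """Pixels needed to capture an original of the given size at a true ppi."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Size of the raw pixel data, rounded up to a whole byte."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8


def max_true_ppi(sensor_w_px: int, sensor_h_px: int,
                 width_in: float, height_in: float) -> float:
    """Best true ppi a sensor can deliver for an original of the given size.

    Matches the sensor's long edge to the original's long edge, then takes
    whichever edge is the limiting one.
    """
    long_px, short_px = max(sensor_w_px, sensor_h_px), min(sensor_w_px, sensor_h_px)
    long_in, short_in = max(width_in, height_in), min(width_in, height_in)
    return min(long_px / long_in, short_px / short_in)


# The example from the notes: a 10x5" book scanned at 300 ppi.
width_px, height_px = pixel_dimensions(10, 5, 300)      # (3000, 1500)
megapixels = width_px * height_px / 1_000_000           # 4.5
rgb_24bit = uncompressed_bytes(width_px, height_px, 24)  # 13500000 bytes
rgb_48bit = uncompressed_bytes(width_px, height_px, 48)  # 27000000 bytes
```

Doubling the bit depth doubles the pixel data, and doubling the ppi quadruples it, which is why the resolution and bit depth choices drive storage cost for the master files.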

    1. Digital Reformatting and Digital Preservation/Curation/Stewardship. Keri Thompson :: Web Services Department :: Smithsonian Institution Libraries www.sil.si.edu :: thompsonk@si.edu :: @DigiKeri_SIL. March 22, 2011
    2. aka Digitization. Works great for access
    3. Imaging basics: Resolution ppi/dpi; Bit depth; Color vs. grayscale vs. b&w; File formats & compression
    4. Imaging for Books: Metadata, metadata, metadata (▪ Descriptive ▪ Structural ▪ Rights/Administrative); OCR; File formats
    5. Digital Capture Best Practices: METS, PREMIS, embedded metadata; FADGI http://www.digitizationguidelines.gov/
    6. (no slide text other than the footer)
    7. Be involved in the process!
    8. So, you’ve scanned a book…what’s next?
    9. “The series of managed activities necessary to ensure continued access to digital materials for as long as necessary.”
    10. Object: Definition; Authenticity; Access. Process: Planning; Management; OAIS model (ISO 14721:2003)
    11. Issues are as much organizational as technological.
    12. Technical: Authenticity and checksums; Systems and media; Access methods
    13. Digital Imaging Primer from Cornell: http://www.library.cornell.edu/preservation/tutorial/contents.html
        FADGI, Federal Agencies Digitization Guidelines Initiative: www.digitizationguidelines.gov/
        NDIIPP, National Digital Information Infrastructure & Preservation Program (LC): http://www.digitalpreservation.gov/
        Digital Preservation Primer by Michael Day, UKOLN via JISC (UK): http://www.slideshare.net/michaelday/digital-preservation-an-introduction
        APA, Alliance for Permanent Access (EU): http://www.alliancepermanentaccess.org/
        Digital Preservation Coalition (UK) (includes Digital Preservation Handbook): http://www.dpconline.org/
        DMP Tool (for grant proposals, minimal but a place to start): https://dmp.cdlib.org/
    14. Thank You! Keri Thompson, Smithsonian Institution Libraries www.sil.si.edu thompsonk@si.edu :: @DigiKeri_SIL
