Intro to Digitization Projects


A North Carolina Connecting to Collections (C2C) workshop co-taught by Audra Eagle Yun (WFU), Nicholas Graham (UNC), and Lisa Gregory (State Archives of NC). This workshop took place on June 13, 2011 in Wilson, NC.

Published in: Education, Technology

  • As curators of collections, we know the importance of description. The same applies online, perhaps even more so, given the dominance of Google and users’ increased real or perceived search savvy.
  • Coverage = geographic, temporal
  • Organized: A-Z. Index + retrieve: faster and more relevant. Preferred terms. Limited scope. Specific domain.
  • Two thesauri examples that show some of these principles in action.
  • LOC partnered with Flickr in 2008 with the idea of a “Commons,” putting up over 3100 photos and encouraging users to tag them. The idea – they’ve got sooooo much stuff, and people want access to it online, yet doing full description by trained professionals is the big bottleneck in the system. So why not leverage the crowd? Here’s what they got.
  • They’ve been continuing to add a lot of content to Flickr, and here’s an example. I’ve pulled a picture, and on the left is the LOC metadata. On the right are the user-designated tags. Highlighted are the ones that overlap; you can see there aren’t many. Call attention to: “deltacounty” (one word), “needle in a haystack” (idioms), “tyre” (alternate spellings)
  • Pretend we’re with a public library, which probably has to consider a pretty broad audience. Plus, this is material that might be used by kids, but isn’t specifically for kids. But we don’t want anything with acronyms – want terms that can be used by laypeople. Let’s go to a controlled vocabulary source, the Getty Research Institute’s Art & Architecture Thesaurus. Popular for describing visual materials. First, search using what we already know. Gelatin silver photograph print.
  • That’s the first term. What else?
  • It’s a wide photo – panorama?
  • City – starting to get into subject, but I’ll try it anyway
  • Cityscapes? Yes or no – judgment call. The other thing I wanted to point out is that it’s often helpful to browse your CV, especially if you’re not familiar with the topic. If you click on the little triangle of boxes in the AAT, you can do that.
  • So once I click on Visual works “Guide Term,” you can browse through the list. I see “photographs” and “photographs by form: color” which brings me to another applicable format term I could use – black-and-white photographs. (Guide term – a term used to collocate like concepts, but shouldn’t be applied within a CV)
  • So here’s what we’ve come up with.
  • Digitization provides greater access to materials, which may lead to the decision to preserve those files. HOWEVER: digitization creates new digital objects that themselves require preservation, and digitization creates metadata that requires preservation.
  • The original painting (in this case the Mona Lisa); the digitized image; the metadata. The reality is that, at this point, we are much more adept at preserving analog objects like paintings and paper. That painting is over 500 years old. We can only dream that our digital files will last that long.
  • Again, lots of federal funding was directed towards the project. Do you have that kind of support? I know that we don’t. And, we’re lucky enough to have multiple staff...
  • Think about it up front.
  • Concerns regarding file formats include what media they’re saved on and what software was used to create them. 5 ¼” floppies: drives aren’t available. WordStar: software isn’t compatible with current operating systems. When we no longer have the software to read them, we need to move them to an alternate format. This could result in data loss or changes in presentation, or may simply be impossible.
  • Proprietary software formats also offer a challenge. To handle files over time, we need to be able to read and possibly manipulate them to make sure they remain readable. If the company that originally created the file format goes out of business without divulging their source code, it can be incredibly difficult to still read it. Open source formats are preferable, because the source code has been made available.
  • So these are best practices for file formats: We all know that the State has made Microsoft products the standard, and in fact they’re the standard in general. However their file formats are proprietary, which can cause a preservation challenge. We can’t tell people not to use the tools they’re provided, but we can ask them…. Keep the original too (or better yet, send it to us)
  • Something near and dear to your hearts. The next issue we’d like to talk about is context, because files without context are adrift… Impress upon people the importance of keeping information about their files with their files. Metadata items – this may seem burdensome to people. Again, reinforce giving them to us. If we don’t know about it, it limits its usefulness. File names – intelligent identifiers are best demonstrated by an example…
  • Some dirty laundry – I discovered this folder out on the K drive earlier this week. This is work that someone did – and took a lot of care in doing – that we currently can’t use. We may be able to find the original object, but we won’t have any info on how the items were created or whether or not they’ve been manipulated. It’ll take more time to piece it together than it probably took to originally digitize.
  • .txt file in same directory or database that refers to file location SORTING!
  • Special constraints. Access restrictions: keeping access to a minimum to avoid accidental loss. Disaster: something people always mention, and especially pertinent because of the older buildings a lot of agencies reside in. Much more prevalent is staff turnover: people don’t consider how to handle files in a period of transition, often until the employee is gone.
  • Intro to Digitization Projects

    1. 1. Preparing for a Digitization Project Wilson County Public Library June 13, 2011 Nicholas Graham Lisa Gregory Audra Eagle Yun
    2. 2. Agenda
        Welcome and Introductions; About Connecting to Collections
        10:00 – 10:45 Planning for a Digital Project: Selecting and Evaluating Materials; Copyright
        10:45 – 11:15 Digitization: Equipment and Expertise; Standards and Guidelines
        11:15 – 12:00 Description: Evaluating Metadata Needs; Metadata Standards and Controlled Vocabularies; Creating a Data Dictionary
        12:00 – 1:00 Lunch
        1:00 – 1:30 Digital Publishing: Free and Cheap Options; Open Source and Homegrown Options; CONTENTdm
    3. 3. Agenda, continued
        1:30 – 2:00 Digital Preservation: Long-term Care for your Digital Files
        2:00 – 2:30 North Carolina Digital Heritage Center: Services Offered by the NC Digital Heritage Center; How to Develop a Project with the Digital Heritage Center
        2:30 – 3:00 Questions and Discussion
    4. 5. Planning for a Digital Project <ul><li>Essential Components of a Successful Project </li></ul><ul><ul><li>Institutional Support </li></ul></ul><ul><ul><li>Community Support </li></ul></ul><ul><ul><li>Support from Other Institutions </li></ul></ul><ul><ul><li>Time, Energy, Curiosity, and Enthusiasm </li></ul></ul>
    5. 6. Planning for a Digital Project <ul><li>Deciding What to Digitize </li></ul><ul><ul><li>What do you have that nobody else does? </li></ul></ul><ul><ul><li>What’s the most difficult? </li></ul></ul><ul><ul><li>What’s the easiest? </li></ul></ul><ul><ul><li>What do you already have described? </li></ul></ul>
    6. 7. Planning for a Digital Project <ul><li>Evaluating Your Materials </li></ul><ul><ul><li>Does your institution own the materials you’re planning to digitize? </li></ul></ul><ul><ul><li>Are the materials in good enough condition to withstand digitization? </li></ul></ul>
    7. 8. Planning for a Digital Project <ul><li>Copyright </li></ul><ul><ul><li>Are the materials in the public domain? </li></ul></ul><ul><ul><li>Does your library own the rights to the materials? </li></ul></ul><ul><ul><li>Have you received permission from the rights holder? </li></ul></ul><ul><ul><li>Have you made an effort to locate the rights holder? </li></ul></ul><ul><ul><li>What is your institution’s risk tolerance? </li></ul></ul><ul><ul><li>Have a take-down policy. </li></ul></ul>
    8. 9. Digitization
    9. 10. Digitization <ul><li>Equipment and Expertise </li></ul><ul><ul><li>Creating a project team </li></ul></ul><ul><ul><ul><li>Roles and responsibilities </li></ul></ul></ul><ul><ul><ul><li>Managing staff and volunteers </li></ul></ul></ul><ul><ul><li>Creating a digital production station </li></ul></ul><ul><ul><ul><li>Creating your space </li></ul></ul></ul><ul><ul><ul><li>Choosing hardware </li></ul></ul></ul><ul><ul><ul><li>Choosing software </li></ul></ul></ul>
    10. 11. Equipment: Flatbed Scanner <ul><ul><li>Used for anything small and flat, including </li></ul></ul><ul><ul><ul><li>Loose photos </li></ul></ul></ul><ul><ul><ul><li>Postcards </li></ul></ul></ul><ul><ul><ul><li>Manuscripts </li></ul></ul></ul><ul><ul><ul><li>Currency </li></ul></ul></ul><ul><ul><li>Negatives often require special inserts and expertise </li></ul></ul><ul><ul><li>Cheap and easy to operate </li></ul></ul><ul><ul><li>Not as good for bound materials </li></ul></ul>Digital Production Center, UNC-Chapel Hill
    11. 12. Equipment: Overhead Document Scanner <ul><ul><li>Ideal for large manuscript collections </li></ul></ul><ul><ul><li>Adjustable surface allows for good image capture from bound materials </li></ul></ul><ul><ul><li>Fast and easy to operate </li></ul></ul><ul><ul><li>Expensive </li></ul></ul>Digital Production Center, UNC-Chapel Hill
    12. 13. Equipment: Book Scanner <ul><ul><li>Designed specifically for mass digitization of monographs </li></ul></ul><ul><ul><li>Very fast and effective </li></ul></ul><ul><ul><li>Expensive to lease </li></ul></ul><ul><ul><li>Outsourcing book digitization may be the best option for many organizations </li></ul></ul>Digital Production Center, UNC-Chapel Hill
    13. 14. Equipment: Sheet-fed Scanner <ul><ul><li>Great for loose, flat, small, and sturdy items (like catalog cards or loose papers) </li></ul></ul><ul><ul><li>Extremely fast (hundreds of scans per minute) </li></ul></ul><ul><ul><li>Not a good option for images, manuscripts, or any materials of varying size </li></ul></ul>Digital Production Center, UNC-Chapel Hill
    14. 15. Equipment: Digital Camera Back and Vacuum Table <ul><ul><li>Ideal for digitizing large and fragile flat items (great for maps) </li></ul></ul><ul><ul><li>Requires a good amount of training and expertise to operate </li></ul></ul><ul><ul><li>Expensive </li></ul></ul>Digital Production Center, UNC-Chapel Hill
    15. 16. Digitization <ul><li>The Scanning Process </li></ul><ul><ul><li>Specifications </li></ul></ul><ul><ul><ul><li>Scan once, use many times  </li></ul></ul></ul><ul><ul><ul><li>LOCKSS </li></ul></ul></ul><ul><ul><ul><li>See NC ECHO, NEDCC, and </li></ul></ul></ul><ul><ul><li>Format and resolution </li></ul></ul><ul><ul><ul><li>Text -- Master: 200 dpi TIFF; Access: 200 dpi JPG; PDF </li></ul></ul></ul><ul><ul><ul><li>Photos or Documents -- Master: 600 dpi TIFF; Access: 300 dpi JPG </li></ul></ul></ul><ul><ul><ul><li>Maps or Drawings -- Master: 300 dpi TIFF; Access: 200-300 dpi JPG </li></ul></ul></ul><ul><ul><ul><li>Video -- Master: AVI; Access: MPEG </li></ul></ul></ul><ul><ul><ul><li>Audio -- Master: WAV; Access: MP3; WMA </li></ul></ul></ul>
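The format and resolution recommendations on this slide can be kept as a small lookup table so that everyone on a project team scans to the same specification. The following is a minimal sketch in Python; the `ScanSpec` class and `spec_for` function are hypothetical names introduced for illustration, and the values are copied from the slide rather than from any external standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanSpec:
    """Master and access file targets for one kind of material."""
    master: str
    access: str


# Values transcribed from the slide; the dictionary keys are an assumption.
SCAN_SPECS = {
    "text": ScanSpec(master="200 dpi TIFF", access="200 dpi JPG; PDF"),
    "photos or documents": ScanSpec(master="600 dpi TIFF", access="300 dpi JPG"),
    "maps or drawings": ScanSpec(master="300 dpi TIFF", access="200-300 dpi JPG"),
    "video": ScanSpec(master="AVI", access="MPEG"),
    "audio": ScanSpec(master="WAV", access="MP3; WMA"),
}


def spec_for(material_type: str) -> ScanSpec:
    """Return the recorded specification, ignoring case and outer spaces."""
    key = material_type.strip().lower()
    if key not in SCAN_SPECS:
        raise KeyError(f"No scanning specification recorded for {material_type!r}")
    return SCAN_SPECS[key]


if __name__ == "__main__":
    spec = spec_for("Maps or Drawings")
    print(f"Master: {spec.master} / Access: {spec.access}")
```

Keeping the table in one place supports the "scan once, use many times" idea: the master target is decided before scanning starts, not item by item.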
    16. 17. Digitization <ul><li>The Scanning Process </li></ul><ul><ul><li>Organizing files </li></ul></ul><ul><ul><ul><li>Naming </li></ul></ul></ul><ul><ul><ul><li>Storing </li></ul></ul></ul><ul><ul><li>Workflow </li></ul></ul><ul><ul><ul><li>Create project queue </li></ul></ul></ul><ul><ul><ul><li>Track digital production </li></ul></ul></ul><ul><ul><ul><li>Note metadata and other tasks </li></ul></ul></ul>
    17. 18. Metadata = Data about Data <ul><ul><li>What is it for? </li></ul></ul><ul><ul><li>What kinds? </li></ul></ul><ul><ul><li>What should you do? </li></ul></ul><ul><ul><li>Standards </li></ul></ul><ul><ul><li>Controlled vocabularies </li></ul></ul><ul><ul><li>Data dictionaries </li></ul></ul>
    18. 19. Metadata ONLINE <ul><ul><li>Vital that it’s… </li></ul></ul><ul><ul><ul><li>Shareable </li></ul></ul></ul><ul><ul><ul><li>Interoperable </li></ul></ul></ul><ul><ul><ul><li>Consistent </li></ul></ul></ul><ul><ul><ul><li>Audience appropriate </li></ul></ul></ul><ul><ul><ul><li>Appropriately complete </li></ul></ul></ul><ul><ul><li>Helps keep your digital content… </li></ul></ul><ul><ul><ul><li>Findable (by humans and machines) </li></ul></ul></ul><ul><ul><ul><li>Manageable </li></ul></ul></ul><ul><ul><ul><li>Authentic </li></ul></ul></ul>
    19. 20. Descriptive metadata <ul><ul><li>Supports user tasks </li></ul></ul><ul><ul><ul><li>Describes and identifies the object or the content of an object </li></ul></ul></ul><ul><ul><ul><li>Discovering/locating the object </li></ul></ul></ul><ul><ul><li>Closely aligned to MARC cataloging </li></ul></ul><ul><ul><li>Examples </li></ul></ul><ul><ul><ul><li>Title, Author, Date of creation, Subject, Free text description/note </li></ul></ul></ul><ul><ul><ul><li>MARC, Dublin Core, VRA Core </li></ul></ul></ul>
    20. 21. <ul><ul><li>Supports management tasks </li></ul></ul><ul><ul><li>Subcategories: </li></ul></ul><ul><ul><ul><li>technical - technical characteristics about the object </li></ul></ul></ul><ul><ul><ul><li>preservation - actions that have been performed on the object and source, custody of the object, provenance </li></ul></ul></ul><ul><ul><ul><li>rights - information about access and use of the object </li></ul></ul></ul><ul><ul><li>Examples </li></ul></ul><ul><ul><ul><li>File size, File name, Rights statement, Digital format </li></ul></ul></ul>Administrative metadata
    21. 22. <ul><ul><li>Supports long-term management and access to object </li></ul></ul><ul><ul><ul><li>May not display in user interface </li></ul></ul></ul><ul><ul><ul><li>Meta-metadata - information about the metadata itself; who created it, when, where it came from, when it was updated. </li></ul></ul></ul><ul><ul><li>Examples </li></ul></ul><ul><ul><ul><li>Bit depth, Checksum, File type, Owner, File creation date, Last modified date </li></ul></ul></ul><ul><ul><ul><li>PREMIS, NC-PMDO </li></ul></ul></ul>Preservation metadata
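Several of the preservation metadata examples above (checksum, file type, last modified date) can be gathered automatically from the file itself. Here is a minimal sketch using only the Python standard library; the function name `technical_metadata` and the dictionary keys are assumptions made for illustration and do not follow PREMIS or NC-PMDO element names.

```python
import hashlib
import os
from datetime import datetime, timezone


def technical_metadata(path, chunk_size=65536):
    """Collect a few file-level facts that can be read without opening the image."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large master files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    stat = os.stat(path)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "file_name": os.path.basename(path),
        "file_type": os.path.splitext(path)[1].lstrip(".").lower(),
        "file_size_bytes": stat.st_size,
        "last_modified": modified.isoformat(),
        "checksum_sha256": digest.hexdigest(),
    }
```

Recording the checksum at capture time gives a baseline: if the value computed later differs, the file has changed since it was created.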
    22. 23. Structural metadata <ul><ul><li>Multi-part objects in the digital environment </li></ul></ul><ul><ul><li>Replicates the physical structure </li></ul></ul><ul><ul><ul><li>E.g., Paging/chaptering in digital books when each page is an image </li></ul></ul></ul><ul><ul><li>Describes the relationships between related objects </li></ul></ul><ul><ul><li>Examples </li></ul></ul><ul><ul><ul><li>Relationship, Page number, Chapter number, Total page numbers, File “order” </li></ul></ul></ul><ul><ul><ul><li>TEI </li></ul></ul></ul>
    23. 24. What should YOU do? <ul><ul><li>Consider your environment </li></ul></ul><ul><ul><li>Decide on standards/controlled vocabularies </li></ul></ul><ul><ul><li>Test them out on a few objects </li></ul></ul><ul><ul><li>Create a data dictionary </li></ul></ul><ul><ul><li>Describe, describe, describe </li></ul></ul>
    24. 25. Consider your users <ul><ul><li>Consider your users : </li></ul></ul><ul><ul><ul><li>Will they understand the terms in the CV you’ve selected to describe your collection? </li></ul></ul></ul>
    25. 26. Consider your community <ul><ul><li>Consider your community : </li></ul></ul><ul><ul><ul><li>If other medical libraries are describing their collections with MeSH, maybe you should, too. </li></ul></ul></ul><ul><ul><ul><ul><li>Less confusing for users </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Makes your collections more interoperable </li></ul></ul></ul></ul>
    26. 27. Consider your collection <ul><ul><li>Consider the nature and extent of the collection now and in the future: </li></ul></ul><ul><ul><ul><li>If the collection is small and discrete, you may not need a huge, complicated CV to describe it </li></ul></ul></ul>Flickr user define23
    27. 28. Consider your metadaters <ul><ul><li>Consider the skills and available time of your data creators : </li></ul></ul><ul><ul><ul><li>Will they understand the terms in the CV, or do you need a specialist to describe the materials? </li></ul></ul></ul><ul><ul><ul><li>What do they know about: </li></ul></ul></ul><ul><ul><ul><ul><li>Neuropathy? </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Neuroscience? </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Mitochondrial Dysfunction? </li></ul></ul></ul></ul><ul><ul><ul><li>Sometimes, you’ve just got other fish to fry </li></ul></ul></ul>
    28. 29. Metadata Standards: Examples
        Name | Focus | Description
        DDI | Archiving and Social Science | Data Documentation Initiative is an international effort to establish a standard for technical documentation describing social science data. A membership-based Alliance is developing the DDI specification, which is written in XML.
        EAD | Archives | Encoded Archival Description - a standard for encoding archival finding aids using XML in archival and manuscript repositories.
        CDWA | Arts and Museums | Categories for the Description of Works of Art is a conceptual framework for describing and accessing information about works of art, architecture, and other material culture.
        VRA Core | Arts & Museums | Visual Resources Association – the standard provides a categorical organization for the description of works of visual culture as well as the images that document them.
        Darwin Core | Biology | Darwin Core is a metadata specification for information about the geographic occurrence of species and the existence of specimens in collections.
        TEI | Humanities, social sciences & linguistics | Text Encoding Initiative - a standard for the representation of texts in digital form, chiefly in the humanities, social sciences and linguistics.
        NISO MIX | Images | Z39.87 Data dictionary - technical metadata for digital still images (MIX) - NISO Metadata for Images in XML is an XML schema for a set of technical data elements required to manage digital image collections.
        MARC | Librarianship | MARC - MAchine Readable Cataloging - standards for the representation and communication of bibliographic and related information in machine-readable form.
        METS | Librarianship | Metadata Encoding and Transmission Standard - an XML schema for encoding descriptive, administrative, and structural metadata regarding objects within a digital library.
        MODS | Librarianship | Metadata Object Description Schema - a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications.
        XOBIS | Librarianship | XML Organic Bibliographic Information Schema - an XML schema for modeling MARC data.
        MPEG-7 | Multimedia | MPEG-7 is an ISO/IEC standard, developed by the Moving Picture Experts Group, that specifies a set of descriptors for various types of multimedia information.
        Dublin Core | Networked resources | Dublin Core - interoperable online metadata standard focused on networked resources.
    29. 30.
    30. 31. Controlled Vocabulary (CV) Definition <ul><ul><li>An organized arrangement of words and phrases that are used to index content and/or to retrieve content through navigation or a search </li></ul></ul><ul><ul><ul><li>Typically includes preferred terms and has a limited scope or describes a specific domain </li></ul></ul></ul>
    31. 32. What CVs give us <ul><ul><li>Standard terminology </li></ul></ul><ul><ul><ul><li>Cows? Steers? Cattle? Livestock? </li></ul></ul></ul><ul><ul><li>Standard formatting </li></ul></ul><ul><ul><ul><li>Raleigh, NC? Raleigh, North Carolina </li></ul></ul></ul><ul><ul><li>Synonyms (non-hierarchical) </li></ul></ul><ul><ul><ul><li>“ USE FOR” </li></ul></ul></ul><ul><ul><li>Hierarchical relationships </li></ul></ul><ul><ul><ul><li>Photographs > black-and-white photographs </li></ul></ul></ul>
    32. 33. Controlled List: Examples
    33. 34. Thesauri: Examples Thesaurus of geographic names (TGN) Constantinople (USE İstanbul) İstanbul (USE FOR Constantinople) Performing arts BT: arts (broad discipline) NT: dance Art & Architecture Thesaurus (AAT)
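The two thesaurus relationships shown on this slide, USE / USE FOR (preferred terms) and BT / NT (broader and narrower terms), can be modeled with two small dictionaries. This is only a sketch of the idea in Python; the function names are hypothetical, and the data is limited to the terms that appear on the slide, not drawn from the TGN or AAT themselves.

```python
# Non-preferred term -> preferred term (the "USE" direction).
USE = {"Constantinople": "İstanbul"}

# Term -> its broader term (the "BT" direction).
BROADER = {
    "Performing arts": "arts (broad discipline)",
    "dance": "Performing arts",
}


def preferred_term(term):
    """Follow a USE reference; a term with no reference is returned as-is."""
    return USE.get(term, term)


def use_for(preferred):
    """List the non-preferred terms that point at a preferred term."""
    return sorted(t for t, target in USE.items() if target == preferred)


def narrower_terms(term):
    """List the terms whose broader term (BT) is the given term."""
    return sorted(n for n, b in BROADER.items() if b == term)


def broader_chain(term):
    """Walk upward through BT links until the top of the hierarchy."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


if __name__ == "__main__":
    print(preferred_term("Constantinople"))
    print(narrower_terms("Performing arts"))
    print(broader_chain("dance"))
```

Storing only the USE and BT directions and deriving USE FOR and NT from them avoids the two halves of a relationship drifting out of sync.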
    34. 35. Controlled Vocabularies: Examples <ul><ul><li>Some are freely available on the Web </li></ul></ul><ul><ul><ul><li>AAT </li></ul></ul></ul><ul><ul><ul><li>MESH </li></ul></ul></ul><ul><ul><ul><li>TGM </li></ul></ul></ul><ul><ul><ul><li>LC-NAF </li></ul></ul></ul><ul><ul><ul><li>Taxonomy Warehouse (clearinghouse) </li></ul></ul></ul><ul><ul><li>Some aren’t </li></ul></ul><ul><ul><ul><li>Library of Congress Subject Headings (online, called “Classification Web”) </li></ul></ul></ul>
    35. 36. Costs/Benefits – Build your own CV? <ul><ul><li>Resource intensive </li></ul></ul><ul><ul><li>Interoperable? </li></ul></ul><ul><ul><li>We’ll never share it . . . </li></ul></ul><ul><ul><li>We’re not experts </li></ul></ul><ul><ul><li>Local need </li></ul></ul><ul><ul><li>We’ll never share it… </li></ul></ul><ul><ul><li>Just the right size </li></ul></ul><ul><ul><li>We’re the experts </li></ul></ul>
    36. 37. Flickr and description <ul><ul><li>The Commons ( </li></ul></ul><ul><ul><li>Pilot with Library of Congress in 2008 </li></ul></ul>
    37. 38. Flickr and description
    38. 39. Assigning Metadata “ New York & bridges from Brooklyn” c. 1913 gelatin silver photographic print 9.5x34” How do we describe the format ?
    39. 41. Assigning Metadata
    40. 44. Assigning Metadata
    41. 46. ………………… .
    42. 47. Assigning Metadata black-and-white photographs gelatin silver prints panoramas
    43. 48. <ul><li>Elements </li></ul><ul><ul><li>Identifier (A) </li></ul></ul><ul><ul><li>Title (D) </li></ul></ul><ul><ul><li>Creator (D) </li></ul></ul><ul><ul><li>Contributor (D) </li></ul></ul><ul><ul><li>Publisher (D) </li></ul></ul><ul><ul><li>Subject (D) </li></ul></ul><ul><ul><li>Description (D) </li></ul></ul><ul><ul><li>Coverage (D) </li></ul></ul><ul><ul><li>Format (P/D) </li></ul></ul><ul><ul><li>Type (D) </li></ul></ul><ul><ul><li>Date (D) </li></ul></ul><ul><ul><li>Relation (S) </li></ul></ul><ul><ul><li>Source (D) </li></ul></ul><ul><ul><li>Rights (A) </li></ul></ul><ul><ul><li>Language (D) </li></ul></ul>
    44. 49. Subjects Bulls Fairs Men Cars Hay Hertford Cattle Horns Hats
    45. 50. Authorized Subject Headings (TGM) Bulls Cattle Beef cattle Livestock Agriculture Animals
    46. 51. Authorized Subject Headings (TGM) Fairs Livestock shows Politicians Automobiles Hay
    47. 52. Names Jim Graham James A. Graham Superintendent of Beef Stock North Carolina Secretary of Agriculture “ The Sodfather” Meadows Domino 66
    48. 53. Authorized Name (LCNAF) Graham, James A., 1921-
    49. 54. Example Metadata Record
        Title: Jim Graham with bull, Meadows Domino 66
        Creator: Fern, Douglas M.
        Date: 1969
        Subject: Fairs; Livestock shows; Politicians; Automobiles; Hay; Graham, James A., 1921-
        Description: Jim Graham, Superintendent of Beef Stock at the North Carolina State Fair stands with the 16 month-old Meadows Domino 66. The bull was sold by J. Horton Doughton of Doughton Meadows Farm, Laurel Springs, N.C., to Mr. and Mrs. A.W. Fanjoy of Joy Acres Farm, Statesville, N.C., for $10,000. It won every class ever shown in except one and received second place there, and was the Grand Champion at the N.C. State Fair. At the time, it was the highest priced bull ever sold in N.C. Weight: 1450 lbs.
        Time Period [Coverage]: 20th century
        Location [Coverage]: Raleigh, N.C. (Wake County)
        Format: image/jpeg; 721 KB
        Type: Image
        Rights: This image may be under copyright. Please contact INSTITUTION NAME for permission to reproduce.
        Identifier: agcoll_17.11.189.jpg
        Capture Date [Date]: 2011-06-05
        Capture Tools: Epson Expression 10000XL
        Metadata Creator: Gregory, Lisa
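To make a record like this shareable, its fields can be serialized as simple Dublin Core XML. The sketch below uses Python's standard `xml.etree.ElementTree` module and the Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The wrapper element name `record`, the function name `to_dublin_core`, and the choice of which fields to include are assumptions for illustration; a publishing system such as CONTENTdm would define its own export format.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core(record):
    """Serialize {element: [values]} as repeated dc: elements."""
    root = ET.Element("record")  # hypothetical wrapper element
    for element, values in record.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# A subset of the example record; repeatable elements hold several values.
record = {
    "title": ["Jim Graham with bull, Meadows Domino 66"],
    "creator": ["Fern, Douglas M."],
    "date": ["1969"],
    "subject": [
        "Fairs", "Livestock shows", "Politicians",
        "Automobiles", "Hay", "Graham, James A., 1921-",
    ],
    "coverage": ["20th century", "Raleigh, N.C. (Wake County)"],
    "format": ["image/jpeg"],
    "type": ["Image"],
    "identifier": ["agcoll_17.11.189.jpg"],
}

xml_text = to_dublin_core(record)

if __name__ == "__main__":
    print(xml_text)
```

Each subject heading becomes its own `dc:subject` element instead of one semicolon-delimited string, which makes the values easier for other systems to index.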
    50. 55. Data Dictionary
    51. 56. Digital Publishing <ul><li>Free and Cheap Options </li></ul><ul><ul><li>Flickr; Picasa </li></ul></ul><ul><ul><ul><li>Upload and edit on desktop or web </li></ul></ul></ul><ul><ul><ul><li>Social tagging and geotagging </li></ul></ul></ul><ul><ul><ul><li>Cloud-based service </li></ul></ul></ul><ul><ul><li>YouTube </li></ul></ul><ul><ul><ul><li>Upload and edit on web </li></ul></ul></ul><ul><ul><ul><li>Formats include AVI, MPEG, MOV </li></ul></ul></ul><ul><ul><ul><li>Cloud-based service </li></ul></ul></ul><ul><ul><li>Internet Archive </li></ul></ul><ul><ul><ul><li>Images, text, audio, and video </li></ul></ul></ul><ul><ul><ul><li>Formats include MPEG2, WAV, TXT, XML, PDF </li></ul></ul></ul><ul><ul><li>Blogs </li></ul></ul>
    52. 57. Digital Publishing <ul><li>Traditional Options </li></ul><ul><ul><li>Open-source Tools </li></ul></ul><ul><ul><li>“ Homegrown” Systems </li></ul></ul><ul><ul><li>CONTENTdm </li></ul></ul>
    53. 58. Digital Preservation <ul><ul><li>Digital preservation is the process of ensuring that you have </li></ul></ul><ul><ul><li>long-term access </li></ul></ul><ul><ul><li>to your digital materials </li></ul></ul><ul><ul><li>Digital Preservation ≠ Digitization </li></ul></ul>
    54. 59. Born-digital resources <ul><ul><li>Files that are created natively on electronic devices, such as computers, cell phones, digital cameras, and digital audio recorders </li></ul></ul>
    55. 60. Digitized resources <ul><ul><li>Analog objects that are transferred to a digital format through some conversion process. </li></ul></ul><ul><ul><ul><li>Paper documents/printed books </li></ul></ul></ul><ul><ul><ul><li>Photographic materials like slides, prints, or glass-plate </li></ul></ul></ul><ul><ul><ul><li>3D objects </li></ul></ul></ul><ul><ul><ul><li>Audio such as cassette tape and LPs </li></ul></ul></ul><ul><ul><ul><li>Film and other moving images </li></ul></ul></ul>
    56. 61. Multiple parts to preserve
    57. 62. NASA loses big. <ul><ul><li>Lunar Orbiter program of 1966 and 1967 </li></ul></ul><ul><ul><li>Mission: to map the entire surface of the moon in preparation for the Apollo landings -- and all five performed magnificently. </li></ul></ul><ul><ul><li>Lunar Orbiter 1 took the first pictures of Earth as a full planet. </li></ul></ul>
    58. 64. NASA loses big. <ul><ul><li>Nancy Evans, a NASA archivist, began collecting the technology in the late-1980s. </li></ul></ul><ul><ul><ul><li>Grabbed the analog tapes and saved imaging hardware from government surplus </li></ul></ul></ul><ul><ul><ul><li>After she retired she stored everything, shrink wrapped and on wooden pallets, in her garage </li></ul></ul></ul>
    59. 65. NASA loses big. Almost. <ul><ul><li>In 2007, two engineers became interested in the project: </li></ul></ul><ul><ul><ul><li>Hired technicians out of retirement </li></ul></ul></ul><ul><ul><ul><li>Located some of the documentation </li></ul></ul></ul><ul><ul><li>Working out of a converted McDonald’s at NASA Ames Research Center in California, they have extracted some of the best-quality images of the Moon available </li></ul></ul>
    60. 66. Why the story? <ul><ul><li>To scare you into caring even more . . . </li></ul></ul><ul><ul><li>To give you ammunition when advocating for digital preservation in your institution </li></ul></ul><ul><ul><ul><li>(because it isn’t sexy, like those fancy digital collections) </li></ul></ul></ul><ul><ul><li>Just remind yourself, someday they’ll thank you. </li></ul></ul>
    61. 67. <ul><ul><li>“ I’ve digitized this stuff … now how do I preserve it?” </li></ul></ul>With digital preservation, this might not apply.
    62. 68. Look out...
    63. 69. Digital object lifecycle
    64. 70. Concerns: File Formats <ul><ul><li>- Obsolescence can affect </li></ul></ul><ul><ul><ul><li>Media </li></ul></ul></ul><ul><ul><ul><li>Software </li></ul></ul></ul><ul><ul><li>Examples: WordStar, AmiPro, Visicalc </li></ul></ul>
    65. 71. Concerns: File Formats <ul><ul><li>- Proprietary vs. Open Source Software </li></ul></ul><ul><ul><ul><li>Proprietary: code is locked down </li></ul></ul></ul><ul><ul><ul><li>Open source: code is available for viewing and manipulation by all </li></ul></ul></ul><ul><ul><ul><li>Escrowed source: source code is held by third party, should company ever cease </li></ul></ul></ul>
    66. 72. File Format Best Practices <ul><ul><li>Use or save in open formats </li></ul></ul><ul><ul><ul><li>Images: PNG, JPEG </li></ul></ul></ul><ul><ul><ul><li>Text: ASCII, Open Office, PDF, XML </li></ul></ul></ul><ul><ul><ul><li>Structured data: CSV </li></ul></ul></ul><ul><ul><li>Keep an eye on old file formats/media </li></ul></ul><ul><ul><ul><li>May need to re-save as a newer version before your software is upgraded or your media can’t be read </li></ul></ul></ul>
    67. 73. Concerns: Context <ul><ul><li>Metadata </li></ul></ul><ul><ul><ul><li>External: Information from the creator </li></ul></ul></ul><ul><ul><ul><ul><li>Could go in a manifest, a .txt file in the same folder </li></ul></ul></ul></ul><ul><ul><ul><li>Internal: File header </li></ul></ul></ul><ul><ul><ul><ul><li>Could mean using the “properties” options available in many software programs </li></ul></ul></ul></ul><ul><ul><li>File names </li></ul></ul><ul><ul><ul><li>“ Intelligent,” consistent identifiers </li></ul></ul></ul>
    68. 74. Concerns: Context <ul><ul><li>Original item? </li></ul></ul><ul><ul><ul><li>Is it all? Part? </li></ul></ul></ul><ul><ul><ul><li>Copyright? </li></ul></ul></ul><ul><ul><li>Creation date? </li></ul></ul><ul><ul><li>Creation equipment? </li></ul></ul><ul><ul><li>Creator? </li></ul></ul><ul><ul><li>Manipulated? </li></ul></ul>
    69. 75. Context Best Practices <ul><ul><li>Get the metadata up front </li></ul></ul><ul><ul><li>Keep the metadata with the file (or refer back from an alternate location) </li></ul></ul><ul><ul><li>Name files intelligently and consistently, using a standard </li></ul></ul><ul><ul><ul><li>For human eyes </li></ul></ul></ul><ul><ul><ul><li>For computers, too </li></ul></ul></ul>
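One way to keep file names consistent "for human eyes" and "for computers, too" is to generate them from a rule and reject anything that does not fit. The sketch below assumes a hypothetical convention: a lowercase collection code, an underscore, a four-digit zero-padded sequence number, and a `tif` or `jpg` extension. This pattern is invented for illustration and is not the naming scheme used in the workshop's example identifier.

```python
import re

# Hypothetical convention: <collection>_<4-digit sequence>.<tif|jpg>
NAME_PATTERN = re.compile(r"^[a-z0-9]+_\d{4}\.(tif|jpg)$")


def build_name(collection, sequence, extension="tif"):
    """Build a file name and refuse anything outside the convention."""
    name = f"{collection.lower()}_{sequence:04d}.{extension.lower()}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"{name!r} does not follow the naming convention")
    return name


def is_valid_name(name):
    """Check an existing file name against the convention."""
    return NAME_PATTERN.match(name) is not None


if __name__ == "__main__":
    names = [build_name("agcoll", n) for n in (10, 2, 1)]
    # Zero-padding makes plain alphabetical sorting match numeric order.
    print(sorted(names))
```

The zero-padding matters for sorting: without it, a file numbered 10 would sort before a file numbered 2 in most file browsers.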
    70. 76. Concerns: Storage <ul><ul><li>Special constraints at your institution </li></ul></ul><ul><ul><li>Access restrictions </li></ul></ul><ul><ul><li>Disaster! </li></ul></ul><ul><ul><li>Staff turnover </li></ul></ul><ul><ul><li>Human error </li></ul></ul>
    71. 77. Storage Best Practices <ul><ul><li>Recognize that IT is often interested in BACKUPs not PRESERVATION </li></ul></ul><ul><ul><li>Keep your important items in multiple locations </li></ul></ul><ul><ul><li>Skip the CDs – use external hard drives if possible </li></ul></ul><ul><ul><li>Keep in mind who has access </li></ul></ul><ul><ul><li>Metadata will help with storage </li></ul></ul>
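Keeping important items in multiple locations only helps if the copies are still identical. A simple check, sketched here in Python with the standard library, is to compare the checksum of each copy against a reference copy. The function names `sha256_of` and `mismatched_copies` are hypothetical, and how often to run such a check is left to local policy.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_copies(reference, copies):
    """Return the copies whose contents differ from the reference file."""
    expected = sha256_of(reference)
    return [str(copy) for copy in copies if sha256_of(copy) != expected]
```

An empty list means every copy matches the reference. Any path that is returned points to a copy that should be replaced from a known-good one.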
    72. 78. Digital Preservation <ul><li>There are lots of resources to help you </li></ul><ul><ul><li>Digital Information Management Program, State Library of North Carolina </li></ul></ul><ul><ul><li>Other online resources – see handout and Digital Preservation Section  </li></ul></ul>
    73. 79. North Carolina Digital Heritage Center <ul><li>Current Projects </li></ul><ul><ul><li>Images of North Carolina </li></ul></ul><ul><ul><li>College and University Yearbooks </li></ul></ul><ul><ul><li>North Carolina Memory </li></ul></ul><ul><ul><li>North Carolina Newspapers </li></ul></ul>
    74. 80. North Carolina Digital Heritage Center <ul><li>Who Can Participate? </li></ul><ul><ul><li>Any cultural heritage institution in North Carolina that holds collections that are open to the public. </li></ul></ul><ul><ul><li>Participants to date have included public and private colleges and universities, community colleges, public libraries, private libraries, and museums. </li></ul></ul>
    75. 81. North Carolina Digital Heritage Center <ul><li>Services Provided </li></ul><ul><ul><li>Digital Publishing </li></ul></ul><ul><ul><ul><li>materials are published on, where they can be searched and discovered alongside collections from other institutions around the state </li></ul></ul></ul><ul><ul><li>Digitization </li></ul></ul><ul><ul><ul><li>the Digital Production Center at UNC-Chapel Hill supports the NC Digital Heritage Center with a wide variety of services </li></ul></ul></ul><ul><ul><li>Project Planning and Consulting </li></ul></ul><ul><ul><ul><li>staff members are available for consultation whether you’re working with the NC Digital Heritage Center or not </li></ul></ul></ul>
    76. 86. North Carolina Digital Heritage Center <ul><li>How to Get Involved </li></ul><ul><ul><li>Review the Contributors Manual </li></ul></ul><ul><ul><li>Contact us to discuss project ideas: </li></ul></ul><ul><ul><ul><li>Nicholas Graham [email_address] / (919) 962-4836 </li></ul></ul></ul><ul><ul><ul><li>Maggie Dickson [email_address] / (919) 962-4836 </li></ul></ul></ul>