Building Digital Collections: Planning and Creating



Workshop presented at the Wisconsin Conference for Local History and Historic Preservation, Wisconsin Rapids, October 11, 2013. Presenters: Sarah Grimm, Electronic Records Archivist, Wisconsin Historical Society and Emily Pfotenhauer, Recollection Wisconsin Program Manager, WiLS.

Published in: Education, Technology
  • Once you have your selection criteria, it may not be possible to review and select everything at once, so how might you sequence the process? Again, the answer will be different for each organization. Think about which materials are: most significant to your organization; most extensive (and therefore a more coherent body of material to manage); most requested or used; easiest to tackle (e.g. most familiar or most ready for ingest: a quick win for your digital preservation process, which is very helpful when you have to prove the value of your efforts to a reluctant administration); oldest (possible historical importance); newest (possible immediate interest); mandated (via local policies, legislation, etc.); at risk. If it were no longer available, what digital files would be the hardest to replace? Some formats become obsolete much faster than others. PDFs stay viable for a long time; video files, however, age very quickly.
  • If you answered “no” to any of these questions, the item may not be a good candidate for digitization.
  • Copyright demo
  • As you go through the selection process, you will need to establish how you are going to name and organize your files. You will find things in many places, named in many different ways depending on who worked on the item. Digital items are psychologically much easier for people to save: 100 items on your hard drive don’t take up as much visual space as 100 items in your office, and a file that is 1 KB looks pretty much like one that is 1 MB or 1 GB. There also tend to be more copies of digital items: everyone keeps a draft, or it gets attached to an email and sent to 10 people, or it gets filed in two places. Everybody keeps their own items; project documentation is rarely one person managing the group’s information anymore. It’s multiplied by the number of people working on the project. As a result, EVERYTHING IS SAVED “just in case,” and it’s often saved more than once.
  • Standards – You need a baseline so that everyone knows how to name items, how NOT to name them, and where and how items will be stored.
  • Short and Descriptive – My record is a file name with 167 characters. While really descriptive, it was too hard to work with: I couldn’t read the entire title in a file list and couldn’t copy it, since it was buried in several layers of folders. We tend to name things in ways that make sense to us at the time, but this is not handy for long-term preservation. You need to name things in a way that will make sense 20 years from now. Has anyone inherited files from previous employees or projects? Do they make any sense? “My stuff” “Important” “To Read”
  • Searching is really difficult if you have to search through multiple layers. Many types of documents will be easier to find if you can come up with a consistent date naming convention.
  • This slide contains links to both the web version and the YouTube version of four videos created by the State Library of North Carolina about file naming procedures. They total about 10 minutes and provide some great tips.
  • Co-locate – It’s OK to move things around if it makes sense to do so. Layers – If you have several layers to hunt through, it can be really hard to find anything. Shallow is better. Searching is really difficult if you have to search through multiple layers. Breadcrumbs – It’s OK to leave “sticky notes” (AKA “READ ME” files) in folders. They can give a brief description of contents, the retention schedule, and any naming conventions. Don’t know – Unknown file formats, files on old media (floppies), password-protected files.
  • File backups – e.g. speeches had multiple drafts, a final version, plus copies in several different font sizes. Supplementary files – e.g. a folder of images that were used in a PowerPoint. Files you can’t open – corrupted. Formats – You may receive both Word and PDF versions and may not want to keep both. As you are creating your inventory, you are likely to discover a lot of really simple places where you can clean up the files you are reviewing. Co-locate – It’s OK to move things around if it makes sense to do so. Bury – If you have several layers to hunt through, it can be really hard to find anything. Shallow is better.
  • Once you’ve decided how you want to handle file naming issues and have made file management decisions, document it. It doesn’t have to be long. You can distribute it in your organization: post it on an intranet or place it in a procedures manual. WHY – You will not be the only keeper of the information. (You weren’t here to ask.) It will help others who may be helping you with the inventory. You can hand it out to organizations/departments you receive information from. For example: “In order to better manage our files, we will accept these file types and formats, and they will be named this way. Do not give us password-protected documents.” You don’t have to organize and fix everything, but you do need to give other people the tools to help you.
  • Talk about handouts
  • Building Digital Collections: Planning and Creating

    2. 2. TODAY’S AGENDA • Introductions • Selecting materials • Selection criteria • Setting priorities • Copyright considerations • Cost considerations • Digitizing collections • Choosing a scanner • Formats and standards • When NOT to scan yourself • Metadata • What is metadata? • Assigning titles and subject headings • Organizing and naming files • Wrap-up and final thoughts Waterford Public Library/University of Wisconsin Digital Collections
    3. 3. INTRODUCTIONS • We are… • Sarah Grimm, Electronic Records Archivist, Wisconsin Historical Society • Emily Pfotenhauer, Recollection Wisconsin Program Manager, WiLS • You are… • What organization do you represent? • What digital projects are you currently working on or thinking about? Eager Free Public Library/University of Wisconsin Digital Collections
    4. 4. WHAT DO YOU MEAN, DIGITIZE? • Selecting materials • Reformatting materials (scanning or photographing) • Adding metadata (descriptive information) • Making available online • Storing and maintaining digital files and data (digital preservation) Wisconsin Historical Society
    5. 5. DEFINING A DIGITAL COLLECTION • A good digital collection… • Is publicly accessible • Is searchable - Includes keywords and other descriptive information (metadata) so users can find what they’re looking for • Uses software that is sustainable (will be around for a long time) and interoperable (can be migrated or shared) • Remains true to the original materials • Respects intellectual property rights • A digital collection is not… • An inventory • An online exhibit/gallery/slideshow
    6. 6. BEFORE YOU EVEN START….. • Don’t scan a mess! Take the time to assess and organize your originals first. • A digital project can be an ideal time to evaluate collection conditions and rehouse materials as needed. • Resources for collections care and organization: • Wisconsin Historical Society Field Services staff • Wisconsin Archives Mentoring Service • National Park Service Conserve-O-Grams Richland County History Room
    7. 7. SELECTING FOR DIGITIZATION Postal workers sorting mail, 1955 Wisconsin Historical Society WHi-36392
    8. 8. TYPES OF MATERIALS • Photographs • Postcards • Letters • Diaries • Scrapbooks • Yearbooks • Newspaper clippings • City directories • Local histories • Magazines • Pamphlets • Maps • Artifacts/3-D objects • Oral histories • Sound recordings • Video recordings • Other? Appleton Public Library
    9. 9. DEVELOPING SELECTION CRITERIA When developing a selection policy, consider… • Your organization’s mission statement and collecting policies • Appeal and interest (is this of value to researchers? To other audiences?) • Uniqueness of materials (is this the only source or does it also exist elsewhere? Avoid duplication) • Focusing on a specific subject, theme or creator • Manageability – tackle a project of appropriate size and scope
    10. 10. SETTING PRIORITIES Ask yourself which materials are… • most significant to your organization? • most extensive? • most requested/used? • easiest? • oldest? • newest? • at risk? Neville Public Museum of Brown County
    11. 11. SELECTION – YES OR NO • This item is rare or unique to our collection. • This item is frequently requested by our patrons/visitors. • This item or very similar items are not found anywhere else on the Internet. • There is enough accurate information available about the item to add useful context for our audience (for example, we know or can find out names of people, locations, dates). • We have the appropriate equipment to create an accurate, high-quality digital copy of this item (for example, item is not too large to fit on scanner), or funding to outsource if needed. • This item is in stable condition and will not be damaged by scanning or other handling. • This item is in the public domain or we have secured permission from the rights holder to make it available online.
    12. 12. CONSIDERING COPYRIGHT • Disclaimer: We are not lawyers. • Owning a physical item does not necessarily mean you hold the copyright to that item. • Public domain = no longer under copyright. In the US in 2013 that means the item was: • Published before 1923 –OR– • Unpublished; creator died before 1943 –OR– • Unpublished; unknown creator; made before 1893 UW-Milwaukee Libraries
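The three rules of thumb on this slide can be sketched as a small check. This is only an illustration: the function name and its parameters are hypothetical, the cut-off years are the ones the slide gives for the US in 2013 (they shift in later years), and it is no substitute for legal advice.

```python
def presumed_public_domain_2013(published_year=None,
                                creator_death_year=None,
                                created_year=None):
    """Apply the slide's 2013 US rules of thumb; pass only what is known."""
    if published_year is not None:
        # Published works: published before 1923
        return published_year < 1923
    if creator_death_year is not None:
        # Unpublished works, known creator: creator died before 1943
        return creator_death_year < 1943
    if created_year is not None:
        # Unpublished works, unknown creator: made before 1893
        return created_year < 1893
    # Nothing known: do not presume public domain
    return False


print(presumed_public_domain_2013(published_year=1908))
print(presumed_public_domain_2013(creator_death_year=1950))
```

An item that fails this check is not necessarily unusable; it falls under the permission and due-diligence steps on the next slide.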
    13. 13. CONSIDERING COPYRIGHT • Works under copyright, copyright holder is known: • Contact copyright holder IN WRITING to request permission to make available online. • Works presumed to be under copyright; copyright holder is unknown or cannot be located: • Due diligence has been made to identify and locate copyright holder. • Be prepared to remove item from digital collection if challenged. Three Lakes Historical Society
    14. 14. SAMPLE COPYRIGHT STATEMENTS • For an item presumed to be in the public domain: 
This item is in the public domain. There are no known restrictions on the use of this digital resource. Contact [your institution] to purchase a high-resolution version of this image. • For an item under copyright; copyright holder has granted permission to put online:
This image has been made available with permission of the copyright holder and has been provided here for educational purposes only. Commercial use is prohibited without permission. Contact [your institution] for information regarding permissions and reproductions. • For an item in which copyright status is undetermined:
This material may be protected by copyright law. The user is responsible for all issues of copyright. Contact [your institution] for information regarding permissions and reproductions.
    15. 15. COPYRIGHT TOOLS - DEMO • Public Domain Sherpa: Public Domain Calculator • Copyright Advisory Network • Copyright Slider • Copyright Genie
    16. 16. POTENTIAL PROJECT COSTS • Scanner • Outsourcing imaging to a commercial vendor • Digital camera and related equipment • Internet access • Storage for digital files • Software for online access • Archival storage supplies • Be sure to budget for TIME and SPACE Merrill Historical Society
    17. 17. FUNDING • Grants • Historical societies: WI Council for Local History mini-grants • Public libraries: LSTA Digitization of Local Resources grants (Dep’t of Public Instruction) • Local corporations or foundations • In-kind contributions • Tech support • Equipment use • Biggest expense is TIME • Paid staff time • “Free” volunteer time • Students/interns Ripon College
    18. 18. CREATING DIGITAL IMAGES Computer center, 1972 St. Norbert College
    19. 19. DIGITAL IMAGING • Goals of imaging: • Create a digital representation that’s faithful to the original item • Create the highest quality image you can with available resources • Anticipate multiple uses (online, print publication, exhibit, etc.) • Scan once—don’t expect to return to re-digitize UW-Madison Archives
    20. 20. CHOOSING A SCANNER • Some features to look for: • Transparency unit --for scanning slides and negatives • Size of scanning bed • Image editing software --many new scanners come with Photoshop Elements • Compatible with your computer’s operating system • Is your computer fast enough to process large image files?
    21. 21. SCANNING PHOTOGRAPHS • Scan all photographs in 24-bit color, even if image is black and white • Scanning resolution (ppi) depends on size of original item • Longest side of item longer than 7” = 300ppi • Longest side shorter than 7” = 600ppi • Save two copies of each scan: • High resolution TIFF (20-40MB) for archiving and printing • Lower resolution JPEG (1-5MB) for online collection, email, social media UW-La Crosse
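As a rough sketch of the resolution rule above, the following picks a ppi from the size of the original and estimates the uncompressed size of a 24-bit scan. The function names are hypothetical, and treating an item whose longest side is exactly 7 inches as a 600ppi case is an assumption, since the slide does not cover it.

```python
def recommended_ppi(width_in, height_in):
    """Slide rule: longest side over 7 inches -> 300ppi, otherwise 600ppi."""
    longest = max(width_in, height_in)
    return 300 if longest > 7 else 600


def uncompressed_size_mb(width_in, height_in, ppi, bits_per_pixel=24):
    """Approximate uncompressed image size in megabytes (1 MB = 1,000,000 bytes)."""
    pixels = round(width_in * ppi) * round(height_in * ppi)
    return pixels * bits_per_pixel / 8 / 1_000_000


for width, height in [(8, 10), (4, 6)]:
    ppi = recommended_ppi(width, height)
    size = uncompressed_size_mb(width, height, ppi)
    print(f"{width} x {height} in: scan at {ppi}ppi, about {size:.1f} MB uncompressed")
```

Under these assumptions an 8 x 10 inch print at 300ppi and a 4 x 6 inch print at 600ppi both land in the 20-40MB range the slide gives for archival TIFFs.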
    22. 22. TIP: USE YOUR HISTOGRAM • A histogram is a graph that shows the distribution of dark and light pixels in a digital image. • Using the Histogram function improves the accuracy/fidelity of your scan • Do a preview scan • In advanced/professional/custom mode, select the Histogram function • Move the left and right sliders to each end point of the histogram • Do not move the sliders INTO the histogram • Scan the image
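To make the idea of histogram end points concrete, here is a minimal sketch that counts pixel values and finds the darkest and lightest levels actually present. It assumes 8-bit grayscale values (0-255) held in a plain list; the sample values are invented, and scanner software does this for you in its Histogram tool.

```python
def histogram(pixels, levels=256):
    """Count how many pixels fall on each brightness level."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    return counts


def end_points(counts):
    """Return the darkest and lightest levels that contain any pixels."""
    used = [level for level, n in enumerate(counts) if n > 0]
    if not used:
        raise ValueError("image has no pixels")
    return used[0], used[-1]


sample = [12, 12, 40, 128, 200, 231]   # invented preview-scan values
dark, light = end_points(histogram(sample))
print(f"Set the left slider at {dark} and the right slider at {light}")
```

Placing the sliders at these two levels uses the full tonal range; moving them inside the range would discard real shadow or highlight detail, which is why the slide warns against moving the sliders INTO the histogram.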
    23. 23. TIP: PLACE IMAGES CAREFULLY Leave a border on all four sides OR crop all four sides evenly.
    24. 24. SCANNING DOCUMENTS • Handwritten texts • Scan in 24-bit color to retain character of original • 300-400ppi is generally sufficient • If feasible, create a transcription • Use care when unfolding papers or handling tightly bound volumes Wisconsin Historical Society
    25. 25. SCANNING DOCUMENTS • Printed texts • Scan in 8-bit grayscale or 24-bit color • 300ppi is generally sufficient • Use OCR (Optical Character Recognition) software to make the text computer-searchable • May be provided with your scanner software • ABBYY FineReader • Adobe Acrobat • OCR is never 100% accurate, but that’s ok L. E. Phillips Memorial Library, Eau Claire
    26. 26. WORKING WITH PRINTED TEXT? OCR! • OCR = Optical Character Recognition • Software that makes printed text computer-readable and fully searchable • Very valuable when scanning books, yearbooks, city directories, newspaper clippings, etc. • A couple of options… • ABBYY FineReader ($100-$170) • Adobe Acrobat ($45 through
    27. 27. WHEN NOT TO SCAN IT YOURSELF • Look to a vendor for scanning… • Oversized materials --maps, blueprints, etc. • Fragile books or scrapbooks --bindings can be damaged by laying flat to scan • Anything with flaking, cracked or otherwise fragile surface • Microfilm --newspapers • Potential vendors • Northern Micrographics, La Crosse • A/E Graphics, Milwaukee • Wisconsin Historical Society (for microfilm)
    28. 28. METADATA Syl carving his name in tree, 1902 Wisconsin Historical Society WHi-69022
    29. 29. METADATA: WHAT IS IT? • Information about stuff • Technical metadata = information about the digital file (size, type, etc.) • Descriptive metadata = information about the content of the item (what are we looking at?) • Helps users find what they’re looking for • Organized, standardized, consistent, searchable Grant County Historical Society
    30. 30. SAMPLE METADATA (Field Name: Sample Data) • Title: DiVall barber shop, Middleton, 1925 • Subjects: Barbers; Barbershops • Type: Still image • Format: image/tiff • Rights statement: This material may be protected by copyright law. The user is responsible for all issues of copyright. • File name: 2006_01_12.tif • Submitter: Middleton Area Historical Society • Date digitized: 2013-04-05 Middleton Area Historical Society
    31. 31. SAMPLE METADATA (Field Name: Sample Data) • Creator: Bartle, F. C. • Date Created: 1925-09-12 OR 1920-1930 • Materials: Photographs • Description: Ralph DiVall (left) and Edwin T. Baltes (right) shave two men seated in barber chairs. According to a family history on file at the Society, DiVall operated this barber shop from the 1920s until his retirement on July 1, 1966. • Location: Middleton, Dane County, Wisconsin • Collection: DiVall Family Collection • Identifier: 2006.01.12 Middleton Area Historical Society
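One way to keep records like the sample on these two slides organized, standardized and consistent is to store each item as a set of named fields and export them to a spreadsheet-friendly CSV file. The sketch below uses the field names and values from the slides; the choice of which fields are required, and the helper names, are assumptions for illustration.

```python
import csv
import io

record = {
    "Title": "DiVall barber shop, Middleton, 1925",
    "Subjects": "Barbers; Barbershops",
    "Type": "Still image",
    "Format": "image/tiff",
    "Creator": "Bartle, F. C.",
    "Date Created": "1925-09-12",
    "Location": "Middleton, Dane County, Wisconsin",
    "Collection": "DiVall Family Collection",
    "Identifier": "2006.01.12",
    "File name": "2006_01_12.tif",
    "Date digitized": "2013-04-05",
}

# Assumption: a local policy that these fields must never be blank.
REQUIRED = ["Title", "Identifier", "File name"]


def missing_fields(item, required=REQUIRED):
    """List required fields that are absent or empty in a record."""
    return [name for name in required if not item.get(name)]


def to_csv(items):
    """Serialize records that share the same fields to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(items[0]))
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


print(missing_fields(record))
print(to_csv([record]))
```

Values that contain commas, such as the title, are quoted automatically by the `csv` module, so they stay in a single column.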
    32. 32. PHOTOGRAPHS – ASSIGNING TITLES The photograph may already have a title.
    33. 33. EXISTING TITLES If the photograph contains a title or caption, transcribe it exactly. Birds-eye-view, No. 4, 1908, Barneveld, Wis.
    34. 34. WHAT MAKES A GOOD TITLE? If the photo does not already have a title, you’ll need to create one. A useful title is… • Descriptive and specific • Brief • Follows specific formatting rules • Capitalize first word and proper names (people, places, institutions) • Don’t start with “A” or “The” • Period not needed at the end
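The formatting rules above lend themselves to a quick automated check before titles are entered into a database. This is a sketch only: the function name is hypothetical, and the 15-word limit used to stand in for "brief" is an assumption, not a figure from the slide. It is meant for titles you create, not for existing captions transcribed exactly.

```python
def title_problems(title, max_words=15):
    """Return a list of rule violations for a newly created title."""
    title = title.strip()
    words = title.split()
    if not words:
        return ["title is empty"]
    problems = []
    if words[0].lower() in {"a", "the"}:
        problems.append('starts with "A" or "The"')
    if not (title[0].isupper() or title[0].isdigit()):
        problems.append("first word is not capitalized")
    if title.endswith("."):
        problems.append("ends with a period")
    if len(words) > max_words:
        problems.append(f"longer than {max_words} words")
    return problems


print(title_problems("Albert Townsend, Clintonville, 1927"))
print(title_problems("the old mill."))
```

The check cannot judge whether a title is descriptive and specific, or whether proper names are capitalized; those still need a human reader.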
    35. 35. BASIC FORMULA FOR CREATING TITLES • SUBJECT, LOCATION, DATE • Subject = person, object, building, etc. • Location = city OR township OR county • Date = year or date range • Only include an element IF KNOWN
    36. 36. PEOPLE & PORTRAITS • Identify the person’s name (first name, last name) • Identify the location to the most specific level possible (City OR Township OR County) • do not include state • Identify the date (Specific year? Date range?)
    37. 37. Albert Townsend, Clintonville, 1927 (SUBJECT, LOCATION, DATE)
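The SUBJECT, LOCATION, DATE formula can be sketched as a small helper that joins whichever elements are known and skips the rest. The function name is hypothetical.

```python
def build_title(subject=None, location=None, date=None):
    """Join the known elements as SUBJECT, LOCATION, DATE; skip unknown ones."""
    parts = [p.strip() for p in (subject, location, date) if p and p.strip()]
    return ", ".join(parts)


print(build_title("Albert Townsend", "Clintonville", "1927"))
print(build_title("Albert Townsend", "Clintonville"))  # date unknown
```

Building titles this way keeps the element order and punctuation identical across a collection, even when several volunteers are entering data.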
    38. 38. PEOPLE & PORTRAITS • Identify…Who? Where? When? • Women • Children • Babies • Carriages/strollers • Stores/shops • Boardwalk • Marathon County • 1890-1899
    39. 39. Women and children with babies in carriages, Manitowoc County, 1890-1899 (SUBJECT, LOCATION, DATE)
    40. 40. BUILDINGS AND CITYSCAPES • Identify the name of the street or view • Identify the location (City OR Township OR County) • Identify the date (Year? Date range?)
    41. 41. 100 block of South Main Street, Fort Atkinson, 1940-1949 (SUBJECT, LOCATION, DATE)
    42. 42. EXPANDED FORMULA FOR CREATING TITLES • SUBJECT, ACTIVITY, LOCATION, DATE • Subject = person, object, building, etc. • Activity = action or event • Location = city OR township OR county • Date = year or date range • Only include an element IF KNOWN
    43. 43. ACTIVITIES AND EVENTS Identify…Who? What are they doing? Where? When? • Tailor and customer • Measuring • Two Rivers • Date unknown – 20th century
    44. 44. Tailor measuring man in suit, Two Rivers (SUBJECT, ACTIVITY, LOCATION)
    45. 45. ACTIVITIES AND EVENTS Identify…Who? What are they doing? Where and when? • Circus elephant • Trainer • Woman on swing • Evansville • 1940-1949
    46. 46. Trainer with circus elephant holding woman on swing, Evansville, 1940-1949 (SUBJECT, ACTIVITY, LOCATION, DATE)
    47. 47. EXERCISE - ASSIGNING TITLES Work in small groups to assign a title to a historic photograph. Remember the basic title formulas: • SUBJECT, LOCATION, DATE • SUBJECT, ACTIVITY, LOCATION, DATE
    48. 48. ASSIGNING SUBJECT HEADINGS • Subject headings are terms or phrases assigned to an item to facilitate searching and browsing a collection. • Consistent use of subject headings helps link related content in your collection and across disparate collections.
    49. 49. CONTROLLED VOCABULARIES • A controlled vocabulary is a standardized, pre-determined list of subject headings. • Some examples of controlled vocabularies: • Library of Congress Thesaurus for Graphic Materials • Library of Congress Subject Headings • Getty Art and Architecture Thesaurus • Nomenclature 3.0 New Berlin Historical Society
    50. 50. TIPS FOR ASSIGNING SUBJECT HEADINGS • Consider the following elements to help select terms: • WHO? People - age, gender, occupation, ethnicity • WHERE? Building or other setting • WHAT? Activities or events • Always copy terms exactly from the controlled vocabulary. • Think of your own “tags,” then search the controlled vocabulary list for correct terms. • How did others do it? Look at similar photos for examples/ideas. • Aim for 1-5 terms. • There is no one right answer!
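Two of the tips above, copying terms exactly from the controlled vocabulary and aiming for 1-5 terms, are easy to check automatically. In this sketch the vocabulary is a tiny stand-in built from the sample headings shown in this deck, not the full Thesaurus for Graphic Materials, and the function name is hypothetical.

```python
# Stand-in list drawn from sample headings in this deck (assumption:
# a real project would load the full controlled vocabulary instead).
VOCABULARY = {
    "Barbers", "Barbershops", "Railroads", "Railroad stations",
    "Carts & wagons", "Students", "Music education", "Youth orchestras",
}


def check_subjects(terms, vocabulary=VOCABULARY, max_terms=5):
    """Return (terms not found exactly in the vocabulary, whether the count is 1-5)."""
    unknown = [term for term in terms if term not in vocabulary]
    count_ok = 1 <= len(terms) <= max_terms
    return unknown, count_ok


print(check_subjects(["Railroads", "Railroad stations", "Carts & wagons"]))
print(check_subjects(["railroads", "Trains"]))
```

The comparison is deliberately exact, including capitalization, because a heading that differs by even one character will not link to matching items in other collections.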
    52. 52. SAMPLE SUBJECT HEADINGS Railroads; Railroad stations; Carts & wagons
    54. 54. SAMPLE SUBJECT HEADINGS Students; Music education; Youth orchestras
    55. 55. EXERCISE – ASSIGNING SUBJECTS Work in small groups to assign subject headings to a historic photograph (choose a maximum of 5 terms). Select terms from the short list extracted from the Library of Congress Thesaurus for Graphic Materials. The full version of this controlled vocabulary is available online:
    56. 56. FILE NAMING AND ORGANIZATION Sixty Years of Quality Canning by the Lakeside Packing Company, ca. 1947. Manitowoc Public Library/ University of Wisconsin Digital Collections
    57. 57. WHY IS THIS IMPORTANT? • To create organizational standards • To help you find it again • To prevent accidental overwriting • To eliminate (minimize) duplication of files Train Wreck Image ID: WHi-2011
    58. 58. FILE NAMING • Keep folder / document titles short and descriptive • Use only lower case letters, numbers, and dashes or underscores • Don’t use spaces or punctuation • Don’t use special characters in your file/folder titles (^”<>|? / : @’* &.) (Just because you CAN doesn’t mean you SHOULD…..) Typing at Dickinson Secretarial School Image ID: WHi-19562
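A naming rule like this one is simple to enforce with a short script. The sketch below checks a name against the allowed characters and proposes a cleaned-up alternative. The function names are hypothetical, and allowing a single dot before the file extension is an assumption, since file names need one even though the slide lists the period among characters to avoid. It does not judge whether a name is short or descriptive.

```python
import re

# Lower-case letters, digits, dashes and underscores, plus one optional extension.
NAME_PATTERN = re.compile(r"[a-z0-9_-]+(\.[a-z0-9]+)?")


def is_valid_name(name):
    """True if the name uses only the allowed characters."""
    return NAME_PATTERN.fullmatch(name) is not None


def suggest_name(name):
    """Lower-case the name and replace runs of disallowed characters with '_'."""
    stem, dot, ext = name.rpartition(".")
    if not dot:                      # no extension present
        stem, ext = name, ""
    stem = re.sub(r"[^a-z0-9_-]+", "_", stem.lower()).strip("_")
    return f"{stem}.{ext.lower()}" if ext else stem


print(is_valid_name("2011_32_001.tif"))
print(is_valid_name("My Photo #1.TIF"))
print(suggest_name("My Photo #1.TIF"))
```

Review any suggested names before renaming files in bulk, and rename copies first so the originals can be recovered if two suggestions collide.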
    59. 59. FILE NAMING • Date your documents consistently • Use leading zeroes for consecutive numbering. For example, a multi-page letter could have file names mac001.tif, mac002.tif, mac003.tif, etc. • Tie your file names to existing catalog numbers if possible
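The leading-zero advice matters because computers sort file names character by character. The sketch below generates zero-padded names like the mac001.tif example and shows what happens without padding; the function name and the three-digit default width are illustrative choices.

```python
def numbered_names(prefix, count, ext="tif", width=3):
    """Generate prefix001.ext, prefix002.ext, ... with zero-padded numbers."""
    return [f"{prefix}{n:0{width}d}.{ext}" for n in range(1, count + 1)]


print(numbered_names("mac", 3))

# Without padding, page 10 sorts ahead of page 2:
print(sorted(["mac2.tif", "mac10.tif", "mac1.tif"]))
# With padding, the pages stay in reading order:
print(sorted(["mac002.tif", "mac010.tif", "mac001.tif"]))
```

Choose a width that covers the largest series you expect; three digits handles up to 999 files per prefix.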
    60. 60. EXAMPLES • Photograph with accession # 2011.32.1 = 201132001.tif –OR– 2011_32_001.tif • Series of images by photographer John Smith = smith001.tif, smith002.tif, smith003.tif • Not so good: Glassplate16039 Auto repair in basement 025.tif
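The accession-number example above can be sketched as a conversion function. The function name is hypothetical, and padding only the last part of the number to three digits is an assumption based on the 2011.32.1 example; adjust it to match your own catalog numbering.

```python
def accession_to_filename(accession, ext="tif", sep="_", item_width=3):
    """Turn an accession number such as 2011.32.1 into a file name."""
    *head, item = accession.split(".")
    parts = head + [item.zfill(item_width)]   # zero-pad the item number only
    return sep.join(parts) + "." + ext


print(accession_to_filename("2011.32.1"))
print(accession_to_filename("2011.32.1", sep=""))
```

Either form works as long as one is chosen and used consistently, so that anyone holding the catalog record can predict the file name.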
    61. 61. RESOURCES • State Library of North Carolina file naming videos • Web version • YouTube version
    62. 62. FILE ORGANIZATION AND MANAGEMENT • Centralize your files • Minimize your layers • Leave breadcrumbs (AKA “READ ME”) • Determine what you don’t know IH General Office Mail Room Image ID: WHi-12016
    63. 63. WHAT NOT TO KEEP? • Backups/copies/drafts • Supplementary files that provide no additional long-term value • Corrupted files • Same item – different file formats • Items that don’t fit your organization’s purpose Boy on Curb near Trash Pile Image ID: WHi-57208
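Exact copies are the easiest item on this list to find automatically. The sketch below groups files in a folder tree by a checksum of their contents, so byte-for-byte duplicates show up together. The function name is hypothetical; it will not catch drafts that differ slightly or the same item saved in different formats, which still need a person to review.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(folder):
    """Return groups of files under folder whose contents are identical."""
    by_hash = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; fine for a sketch,
            # but very large files would be better read in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]


if __name__ == "__main__":
    for group in find_duplicates("."):
        print("Identical files:")
        for path in group:
            print("   ", path)
```

The script only reports duplicates. Deciding which copy to keep, and documenting that decision, remains a manual step.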
    64. 64. DOCUMENT YOUR DECISIONS…. Sinclair Lewis Typing Image ID: WHi-51874
    65. 65. WRAPPING UP – FINAL THOUGHTS Commencement, 1978 UW-Madison Archives
    66. 66. TIPS FROM OTHER DIGITIZERS • If I could do it all over again, I would: • Tackle a smaller group of materials at first • Make sure two people started the project at the same time so we could help each other • Start with a clearer plan • Take the time to sort and research the physical collection before digitizing • Have firm deadlines to help me stay on track Langlade County Historical Society
    67. 67. NEXT STEPS/TO DO LIST • Review collections and set priorities for digitization. • Consider developing a written selection policy. • Determine the copyright status of any materials you plan to share online and secure permissions from copyright holders if materials are not in public domain. • Acquire scanning equipment or make other plans for conversion. • Familiarize yourself with good, useful metadata by looking at other online collections. • Develop a file naming convention document.
    68. 68. THANK YOU! • Sarah Grimm, Wisconsin Historical Society 608-261-1008 • Emily Pfotenhauer, WiLS 608-616-9756 • Slides and handouts available at http://recollectionwisconsin.org/localhistory2013 South Wood County Historical Museum