A Digitization Primer for Botanical and Horticultural Librarians

CBHL 2002, San Francisco

  • Introductions. Disclaimer: we are not on a “Guru Tour” of digitization workshops. We are here to share what we have learned.
  • Ask people why they want to digitize. Call on or read from those who replied. Tell how Raven asked why, if the art museum’s library was digitized, ours wasn’t. Cataloging vs. digitizing. We are really talking about “reformatting,” as opposed to items born digital.
  • Share what we have learned. Everything is scalable; you are not LC. We are focusing on projects, but a project can be 50, 5,000, or 50,000 images. Even if you are scanning only for in-house use, not for web delivery, much of this is relevant. As complicated as this seems, it is continually getting more standardized and easier. Many more resources and standards are available, such as A Framework of Guidance for Building Good Digital Collections.
  • Distinguish between scanning for enduring value and scanning for immediate value. It is important to think about issues bigger than “scanning,” especially when using funding from large agencies. Lots of people around the world are digitizing. Standards and interoperability mean more bang for the buck: reuse and repurposing of digitized items. Think big and think about the future. The Framework includes sources for detailed information on good selection, on creating good digital objects, on creating good metadata, and on running a good project.
  • See matrix, Appendix 1. Predicting users is difficult.
  • Allow for major staffing, hardware, or software delays. Have a project manager who is accountable and empowered. Who do you need to talk to about server space, databases, programming? Write documentation for others and for your own institution, and reports for granting agencies. Make an estimate, then double the time and triple the expense. Cost for scanning a 500-page rare book: $3,000 in staff time and media, with no equipment or indirect costs.
  • MrSID controls access to high-resolution images.
  • Selection can be based on themes, geography, historic period, subject headings, core lists, or material types (images). Don’t give exact localities of rare and endangered plants. Don’t scan personal or sensitive information about founders or their descendants.
  • Will address preservation of digital objects later.
  • “Guided tour” of images. Browse by Subject could be thought of as an exhibit approach.
  • No search function. Purpose is to tell the story about Paul Mellon, not document his collection in its entirety.
  • Give the MBG Archives project as an example.
  • No search feature. No thought given to future books, format, etc.
  • What’s different about electronic resources
  • Some history and standards
  • We want to provide guidance as well as guidelines
  • Think through the long-term commitment: next month, 5 years, 50 years, 100 years? You cannot put a digital collection on the shelf for 50 years. Storing archival images on CD or servers. Write documentation for future archaeologists, or for the project manager who takes over when you move to a new job. Good metadata will contain some documentation.
  • Transcript of "A Digitization Primer for Botanical and Horticultural Librarians"

    1. A Digitization Primer for Botanical and Horticultural Librarians
       • Chris Freeland – MBG Web and Digitization Project Coordinator
       • Doug Holland – MBG Administrative Librarian
       • Heather Rolen – NYBG Digitization Specialist
       CBHL 2002: A Digitization Primer
    2. Why Digitize?
       • Makes resources broadly available while preserving the original.
       • 24/7 worldwide availability.
       • Capitalize on investment in resources and technology (collections, storage, curation)
       • Assimilate disparate resources
       • Learn something new (It’s Fun!!)
       • Pressure from above (Everyone is doing it!)
    3. Survey Summary
       • 13 Humble Responses!
         – Little to no experience with projects
         – Some with Scanning/Photoshop
       • Types of materials
         – Slides and glass plates: 6
         – Photos (electrophoresis gels?): 7
         – Printed material [loose, bound (rare books!), newspaper clippings, maps, architectural drawings, seed and nursery catalogs]: 10
         – Herbarium specimens: 2
       • In-house image database (Annie Malley)
    4. What we will be covering
       • Audience and Users
       • Goals
       • Ownership
       • Preservation
       • Access
       • Metadata
       • Scanning
       • Sustainability
    5. A Framework of Guidance for Building Good Digital Collections
       http://www.imls.gov/pubs/forumframework.htm
       • Interoperability
       • Reusability (Repurposing)
       • Persistence
       • Verification
       • Documentation
       • Respecting copyright and intellectual property law
       • Think a little bigger and think about the future.
    6. Audience and Users
       • Who are your users?
         – Today
         – Future
       • Lifelong Learners
       • Scholar/researcher
       • Students
       • Business Community
    7. Why is it important to define users?
       • Guides selection process
       • Determines complexity and type of metadata
       • Determines image resolution
       • Determines web-site design (database or exhibit format)
       • Determines equipment needs
    8. How can you retain users and keep them coming back?
       • Keep adding new content
       • Create value-added content after the initial rollout
         – Lesson plans, etc.
       • Create an e-mail newsletter
    9. User Comments
       • Should include a way to solicit, retain, and respond to user comments and suggestions.
         – Can tell you if you’re reaching your intended audience
         – Can provide you with wonderful comments to include in grant proposals or to show your administration:
           • “Thanks so much for sharing this. This is the internet at its best.”
           • “This is fantastic. I am most enjoying these rare books, especially the illustrations. I hope to use this with teachers in the future.”
    10. Planning and Goals
       • Have clear project goals and objectives
       • Be aware that funding agencies may influence the scope of your project
       • Designate a project manager.
       • Identify key departments or staff
       • Stay realistic (perhaps conservative) in your production promises.
       • Document all changes and evolution in your project.
    11. Ownership
       • Copyright needs to be considered
       • Holding doesn’t mean owning
       • Is item in public domain?
         – http://www.unc.edu/~unclng/public-d.htm
         – http://cidc.library.cornell.edu/copyright/
       • Modify your deed of gift to include digital distribution
       • Controlling intellectual property after digitization
    12. Selection
       • Audience needs
       • Good Collections
       • Condition
       • One or many collections or mainstreaming
       • Item formats and sizes
       • Metadata available or Collection condition (Activities other than scanning require 75% of project time)
       • Rights
       • Sensitive Issues (Skeletons??)
       • Who else is doing the same or similar items?
    13. Preservation and Digitization
       • Digitization is NOT preservation
       • Do not discard originals.
       • Why not?
         – Media longevity
         – Software and hardware obsolescence
       • Digitization does preserve the original through reduced exposure and handling.
    14. Preserving the Original
       • Handle Items Once (Scan high!)
       • Consider rehousing either before or after scanning.
       • Appropriate long-term storage
       • Remember 2/3 of project time has nothing to do with scanning.
    15. Discovery and Access (or Scanned and Deliver)
       • Online Catalog or Database
         – Subject Heading or keyword search
       • Finding Aids for archival collections
       • Exhibit-style educational page
       • Don’t forget metatags and visibility to Web search engines. (If that is one of your goals!)
    16. Web Access and Display
       • Exhibit Approach
       • Database Approach
       • Both
    17. Exhibit Approach
       • Pull together text, images, maps, documents, etc. to tell a story
       • Value-added information enhances the scanned images
       • Appealing to a wide audience
    18. Example of Exhibit Approach
       • Private Passions, Public Legacy: Paul Mellon’s Personal Library at the University of Virginia
    19. Database Approach
       • Give access to images through a search mechanism
         – Generally have to know something about the collection to find what you’re looking for
       • Appealing to a more focused audience
         – Scholars, professionals
    20. Example of Database Approach
       • Making of America
       • Google Image Search
    21. Both Approaches
       • Provide value-added information to reach a wider audience
       • Also give full access to the data for people who know what they want to view.
    22. Example – MBG Rare Book Site
    23. Design vs. Development
       • Usually spend too much time discussing background colors and layout
         – Too subjective
       • Should focus on
         – Search engine placement
         – Successful searches for key phrases
         – Usage statistics
    24. “If you build it, they may not come”
       • Indexing by search engines is not a given
       • Great images + great metadata do not equal a popular site
       • You must consider how search engines work
    25. Indexing tips
    26. Indexing tips – <meta> tag
       • <meta name="description" content="The Missouri Botanical Garden Library presents its Rare Book Digitization Project.">
       • <meta name="keywords" content="botanical illustration,rare books, herbals, engravings, illustrations, botany, botanical illustrations, medicinal plants, Desktop Wallpaper, images of medicinal plants, plant images, Jaume, Kohler">
       • <META NAME="DC.Title" CONTENT="Plate 1 - Cinchona officinalis; <i>Cinchona officinalis</i> L.; quinine">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#title">
       • <META NAME="DC.Creator" CONTENT="Lambert, Aylmer Bourke">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#creator">
       • <META NAME="DC.Subject" CONTENT="(SCHEME=LCSH) Cinchona.|Hyaenanche.|Rubiaceae.|Euphorbiaceae.|Graphic media : --Copper engraving -- Uncolored -- 1797 -- England.|">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#subject">
       • <META NAME="DC.Subject" CONTENT="(SCHEME=LCCS) QK495 .F270 L35 1797">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#subject">
       • <META NAME="DC.Description" CONTENT="Plate 1 - Cinchona officinalis; <i>Cinchona officinalis</i> L.; quinine">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#description">
       • <META NAME="DC.Publisher" CONTENT="Missouri Botanical Garden">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#publisher">
       • <META NAME="DC.Contributor.CorporateName" CONTENT="Missouri Botanical Garden">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#contributor">
       • <META NAME="DC.Date" CONTENT="(SCHEME=ISO8601)1998-10-01">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#date">
       • <META NAME="DC.Type" CONTENT="Image.Illustration">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#type">
       • <META NAME="DC.Format" CONTENT="(SCHEME=IMT) text/html">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#format">
       • <LINK REL=SCHEMA.imt HREF="http://sunsite.auc.dk/RFC/rfc/rfc2046.html">
       • <META NAME="DC.Identifier" CONTENT="http://ridgwaydb.mobot.org/mobot/rarebooks?referencenumber=QK495F270L351797">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#identifier">
       • <META NAME="DC.Language" CONTENT="(SCHEME=ISO639-1) en">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#language">
       • <META NAME="DC.Relation" CONTENT="QK495F270L351797">
       • <LINK REL=SCHEMA.dc HREF="http://purl.org/metadata/dublin_core_elements#relation">
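Tags like the ones on this slide are tedious to type by hand for every page, so they are usually generated from a record. The following is a minimal sketch of that idea, not the code MBG actually used: the function name `dc_meta_tags` and the dictionary layout are assumptions made for illustration, and the sample values are copied from the slide.

```python
from html import escape


def dc_meta_tags(record):
    """Return one HTML <meta> line per Dublin Core value.

    `record` maps an element name (e.g. "Title") to a string or a list of
    strings. This layout is a hypothetical convention for this sketch.
    """
    lines = []
    for element, values in record.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # Escape &, <, > and quotes so the attribute stays valid HTML.
            content = escape(value, quote=True)
            lines.append('<meta name="DC.%s" content="%s">' % (element, content))
    return lines


# Sample values taken from the slide; repeated Subject values become
# repeated tags, since every Dublin Core element may repeat.
record = {
    "Title": "Plate 1 - Cinchona officinalis",
    "Creator": "Lambert, Aylmer Bourke",
    "Publisher": "Missouri Botanical Garden",
    "Date": "1998-10-01",
    "Subject": ["Cinchona", "Rubiaceae"],
}

for line in dc_meta_tags(record):
    print(line)
```

Escaping the values matters when a title contains markup or an ampersand, since unescaped characters inside a `content` attribute can confuse both browsers and indexing robots.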
    27. Indexing tips - <title> tag
       • Use descriptive <title> tags:
         – <title>MBG Rare Books: Plate 1 - Cinchona officinalis</title>
    28. Indexing tips - <body> text
       • Use text in your page:
         – A Description of the Genus Cinchona by Lambert, Aylmer Bourke
         – Description of Page: Plate 1 - Cinchona officinalis (Cinchona officinalis L., quinine)
    29. More indexing tips
       • Having the key phrase in all three (<meta>, <title>, and body text) increases your search engine rank
       • Indexing robots follow links on pages
         – They will follow the hierarchy of your site
       • Robots don’t:
         – Click on buttons
         – Use dropdown menus
         – Natively navigate or index Flash/multimedia content
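The rule of thumb on this slide (key phrase in the <meta> tags, the <title>, and the body text) is easy to check mechanically. Here is a rough sketch using only Python's standard `html.parser` module; the class name `PhraseAudit`, the function `phrase_coverage`, and the sample page are hypothetical and stand in for whatever pages a project actually serves.

```python
from html.parser import HTMLParser


class PhraseAudit(HTMLParser):
    """Collect the text of <meta> description/keywords, <title>, and <body>."""

    def __init__(self):
        super().__init__()
        self.meta, self.title, self.body = [], [], []
        self._in_title = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.meta.append(attrs.get("content") or "")
        elif tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_title:
            self.title.append(data)
        elif self._in_body:
            self.body.append(data)


def phrase_coverage(html, phrase):
    """Report whether `phrase` appears in each of the three places."""
    audit = PhraseAudit()
    audit.feed(html)
    audit.close()
    needle = phrase.lower()
    return {
        "meta": needle in " ".join(audit.meta).lower(),
        "title": needle in "".join(audit.title).lower(),
        "body": needle in "".join(audit.body).lower(),
    }


# Hypothetical page assembled from the text on the preceding slides.
page = """<html><head>
<title>MBG Rare Books: Plate 1 - Cinchona officinalis</title>
<meta name="keywords" content="botanical illustration, rare books, medicinal plants">
</head><body>
<p>Description of Page: Plate 1 - Cinchona officinalis (Cinchona officinalis L., quinine)</p>
</body></html>"""

print(phrase_coverage(page, "Cinchona officinalis"))
```

On this sample page the phrase is found in the title and body but not in the keywords, which is exactly the kind of gap the slide warns about.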
    30. Case Study: Köhler’s Medizinal Pflanzen
       • Published 1883 – 1914
       • Digitized in 1997
       • Images were heavily edited and cropped
       • Text was added to images
    31. Case Study: Köhler’s Medizinal Pflanzen
       • Created static HTML pages with links through site
       • Created a list of current botanical names with links to illustration
       • NOT technically sophisticated
       • Used an Exhibit Approach
    32. Case Study: Köhler’s Medizinal Pflanzen
       • Receive more user feedback and image requests for this site than any other
       • Reasons:
         – Popular content with interesting images
         – Has been online for several years
         – Simple web display that can be indexed by all search engines
    33. Lessons learned
       • DON’T:
         – spend too much time bickering over color schemes, fonts, and layout
         – confuse users and indexing robots with irregular navigation
         – ignore importance of search engine results for your content
    34. Lessons learned
       • DO:
         – spend time creating rich <meta> and <title> tags and body text
         – learn how search engines index content
         – consider display, but focus on development
    35. Metadata and Electronic Resources
       • Vast amount of information, increasing at a faster rate than is manageable
       • Standards developing and evolving, using best practices
       • Web-enabled search engines: many, varied in retrieval success
       • Everyone’s a publisher, everyone’s a librarian
       • HTML metatag structure and content are limited, which inhibits reliable searching
       • Lack of subject-rich terms
    36. Metadata and Standards
       • Metadata definition: data about data; data that aids in identification, description and location of networked resources
       • Standard Generalized Mark-up Language (SGML), 1986
         – Structure for producing documents
         – Document Type Definition (DTD) created for each type of material or individual publication
         – SGML supports encoding the text AND a description of the document in the header
    37. Dublin Core Basics
       • http://purl.oclc.org/dc/
       • How it began
       • Why it is important
         – Simple to create
         – Easy to understand
         – International
         – Flexible
       • Descriptive, Structural and Administrative metadata
       • All elements repeatable, all optional
    38. Dublin Core Elements
       • Title
       • Creator
       • Publisher
       • Contributor
       • Description
       • Identifier
       • Date
       • Format
       • Subject terms/classification
       • Rights Management
       • Source
       • Type
       • Language
       • Relation
       • Coverage
    39. How MBG uses DC for a book
       • Title: Icones pictae plantarum rariorum descriptionibus et observationibus illustratae / Auctore J.E. Smith, M.D. Fasc. 1-3.
       • Creator: Smith, James Edward
       • Subject_LCSH: Botany -- Pictorial works.
       • Subject_LCCS: QK98 .S657
       • Description: 2 p.l., 18 numb. 1. : 18 col. pl. ; 50 cm.
       • Publisher: London, 1790-93, Missouri Botanical Garden
       • Contributor: Photography and Web design by Debbie Windus.
       • Date: 1998-09-01
       • Identifier: http://ridgwaydb.mobot.org/mobot/rarebooks/title.asp?relation=QK98S657
       • Relation: QK98S657
       • Rights: http://ridgwaydb.mobot.org/mobot/rarebooks/copyright.asp
    40. How MBG uses DC for a page/image
       • Title: QK495F270L351797_0060.jpg
       • Creator: Lambert, Aylmer Bourke, 1761-1842
       • Subject: Cinchona.|Hyaenanche.|Rubiaceae.|Euphorbiaceae.|Graphic media : --Copper engraving -- Uncolored -- 1797 -- England.|
       • Description: Plate 9 - Cinchona angustifolia
       • Publisher: Missouri Botanical Garden
       • Contributor: Missouri Botanical Garden
       • Date: 1998-10-01
       • Type: Image
       • Format: jpeg
       • Identifier: 0060
       • Source: QK495.F270 L35 1797
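A record like the page/image example above can also be stored as XML rather than as HTML tags. The sketch below shows one possible serialization with Python's standard `xml.etree.ElementTree`; it is an illustration only, not MBG's actual storage format. The wrapper element `record` and the function name `dc_record_to_xml` are assumptions; the namespace URI is the published one for the 15 Dublin Core elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dc_record_to_xml(fields):
    """Serialize (element, value) pairs as dc:* children of a <record>.

    A list of pairs is used instead of a dict so that an element such as
    "subject" can repeat. <record> is a hypothetical wrapper element.
    """
    root = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Values copied from the page/image slide; the pipe-delimited subject
# string is split into repeated subject elements.
fields = [
    ("title", "QK495F270L351797_0060.jpg"),
    ("creator", "Lambert, Aylmer Bourke, 1761-1842"),
    ("subject", "Cinchona"),
    ("subject", "Rubiaceae"),
    ("description", "Plate 9 - Cinchona angustifolia"),
    ("publisher", "Missouri Botanical Garden"),
    ("date", "1998-10-01"),
    ("type", "Image"),
    ("format", "jpeg"),
    ("identifier", "0060"),
    ("source", "QK495.F270 L35 1797"),
]

print(dc_record_to_xml(fields))
```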
    41. Subject Access
       • Controlled vocabularies
         – Vocabularies and thesauri
         – Taxonomies
         – Access
    42. XML
       • METADATA
         – descriptive: facilitate discovery
           • OAI
           • MARC
           • EAD
           • Dublin Core
         – administrative: identify/manage/preserve digital object(s) over time
           • info on where pieces reside
           • info on how to view digital object
           • info on scanning process
    43. XML
       • METADATA cont.
         – structural: storage/presentation of digital object(s)
           • METS (Metadata Encoding and Transmission Standard): http://www.loc.gov/standards/mets
           • TEI (Text Encoding Initiative): http://www.tei-c.org
           • TEI for Libraries (5 levels of encoding): http://www.indiana.edu/~letrs/tei/
           • METAe, automatic metadata creation: http://meta-e.uibk.ac.at
    44. XML
       • SGML/HTML/XML
         – Standard Generalized Markup Language (1986)
         – Hypertext Markup Language (1989)
         – eXtensible Markup Language (1996)
       • XML
         – a document markup language for defining structured information
         – a language used by computers to define hidden information about the structure of a document
    45. XML
       • XML cont.: best of both worlds
         – storage
           • can store any kind of structured info/not limited to Web delivery
         – presentation
           • flexible development/design
    46. XML
       • XML is a lot simpler than SGML and is sometimes described as an 80/20 solution: you get 80% of the power of SGML for 20% of the effort
       • You can use XML without thinking ahead and make up your elements en route as long as they nest within each other. This is called writing "well-formed" rather than "valid" XML. Purists discourage this but people will do it anyhow.
       • XML is specifically designed to work easily with the Web.
         – http://facultyweb.at.nwu.edu/english/mmueller/ariadne/teixintro/index.htm
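The difference between "well-formed" and "valid" on this slide can be seen with any non-validating parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which checks only that elements nest and close properly; it does not check a document against a DTD or schema, so it says nothing about validity. The function name `is_well_formed` and the sample element names are made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as XML, i.e. its elements nest correctly."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Made-up element names, invented "en route" as the slide describes.
nested = "<plate><name>Cinchona officinalis</name></plate>"
crossed = "<plate><name>Cinchona officinalis</plate></name>"

print(is_well_formed(nested))   # elements nest within each other
print(is_well_formed(crossed))  # closing tags are in the wrong order
```

Checking validity would additionally require a DTD or schema and a validating parser, which the standard library does not provide.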
    47. XML
       • XML and NYBG digitization project
       [Diagram labels: XML text files; Images; GSDL software suite; Public access server; Public use]
    48. XML
       • XML/NYBG project
         – lack of adopted standards
         – nature of the data
         – delivery mechanisms
       • Research!
    49. XML
       • XML sites
         – http://www.oasis-open.org/cover/sgml-xml.html
         – http://www.w3.org/XML/
         – http://www.ucc.ie/xml/#exec
         – http://www.xml.com/
       • SGML sites
         – http://www.oasis-open.org/cover/general.html
         – http://www.w3.org/MarkUp/SGML/
       • Listservs
         – http://sunsite.berkeley.edu/XML4Lib/
         – http://www.oasis-open.org/cover/lists.html
    50. Scanning
       • Principles for Scanning
       • Access (not preservation)
       • Storage
       • Outsource options
    51. Howard Besser’s Principles
       • Scan at the highest resolution appropriate to the informational content of the originals
       • Scan at an appropriate level of quality to avoid rescanning and re-handling of the originals in the future: scan once
       • Create and store a master image file that can be used to produce derivative image files and serve a variety of current and future user needs
       • Use system components that are non-proprietary
    52. Besser’s Principles Cont.
       • Use image file formats and compression techniques that conform to industry standards
       • Create backup copies of all files on a stable medium
       • Create meaningful metadata for image files or collections
       • Store media in an appropriate environment
       • Monitor and recopy data as necessary
       • Outline a migration strategy for transferring data across generations of technology
       • Anticipate and plan for future technological developments
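The slides do not say how to "monitor" stored files. One common approach, offered here only as an assumption about how that principle might be put into practice, is to record a checksum for each master file and compare it later. The sketch below uses Python's standard `hashlib`; the function names and the sample file name are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file name in `folder` to its checksum."""
    return {p.name: sha256_of(p) for p in sorted(Path(folder).iterdir()) if p.is_file()}


def changed_files(folder, manifest):
    """List files whose current checksum differs from the manifest, or that are missing."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Demonstration on a temporary folder with a stand-in "master" file.
with tempfile.TemporaryDirectory() as folder:
    master = Path(folder) / "QK495F270L351797_0060.tif"  # hypothetical file name
    master.write_bytes(b"stand-in for image data")
    manifest = build_manifest(folder)
    master.write_bytes(b"stand-in for image data, damaged")
    print(changed_files(folder, manifest))
```

Any file reported by `changed_files` would be a candidate for recopying from a backup.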
    53. Scan Basics
       • Digital formats: master/archival, access, thumbnail
       • Always keep a facsimile master
       • Minimum recommended standards: NARA/LC/CPD
       • Hardware requirements:
         – Scanner that exceeds your standards
         – Workstation: at least Pentium III, 650 MHz, storage (20+ gigabytes)
         – Server for display and archiving
    54. MBG Imaging Lab Specs
       • See handout
    55. Scanning
       • Your requirements may be different than the accepted norm
         – Maybe 600 dpi is too low for your project
       • Should be aware of generally accepted guidelines
         – Have to know the rules before you break them
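Resolution choices translate directly into storage needs. As a rough worked example, the sketch below computes the uncompressed size of one scan and how many such files fit on the 20-gigabyte disk mentioned under hardware requirements. The 8 x 10 inch page size and 24-bit color depth are assumptions chosen for illustration, not figures from the slides, and the function name is hypothetical.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Size in bytes of an uncompressed scan of a page of the given size."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed example: an 8 x 10 inch page scanned at 600 dpi in 24-bit color.
page_bytes = uncompressed_bytes(8, 10, 600)
disk_bytes = 20 * 10**9  # the 20-gigabyte figure from the hardware list

print(page_bytes)                # 86400000 bytes, about 86 MB per master
print(disk_bytes // page_bytes)  # 231 masters fit on the disk
```

Halving the resolution to 300 dpi cuts the file size to a quarter, which is one reason the resolution decision deserves attention before scanning starts.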
    56. Scanning Guidelines
       • Review handout
    57. Scanning
       • Software: scanners come with some basic software, Adobe Photoshop Lite
       • Keep current on software
       • Physical facilities for scanning
       • When to outsource/special materials
    58. Outsourcing
       • What?
         – Contract work to service providers
         – Off-site, on-site, imaging only, image/content display/management provider, ASP (application service provider)
       • Why?
         – Factors to consider
           • project size
           • project expectations
           • staff size
    59. Outsourcing
       • Why? Cont.
           • staff expertise
           • available resources (funding for staff training and equipment, physical space)
           • deadlines
    60. Outsourcing
       • NYBG/Mellon Digitization Project
         – 3 titles from RB collection
         – conservation efforts necessary
         – 21-month grant, no lab, no allocated space to build lab, no staff, no expertise, no extra funding for equipment or staff training, project expectations (grant stipulates archival quality imaging, hard deadline)
         – image capture outsourced to east coast vendor, quality checks performed in-house
    61. Outsourcing
       • Weighing the pros and cons
         – fragile/rare materials under supervised control vs. equipment costs and updates/staff/expertise/time/physical space
       • Worth consideration
         – “For digitization projects, institutions and service providers are working with developing technologies and a new vocabulary, creating new quality and production benchmarks, and trying to determine best practices. All the while, digital technology continues to evolve. Both parties must collaborate to determine capture requirements, costs, and deliverables; manage the process; and agree on criteria.” (Meg Bellinger, President, Preservation Resources, Moving Theory into Practice, 2000)
    62. Outsourcing
       • Vendors
         – Octavo: http://www.octavo.com/
         – Systems Integration Group: http://www.sigi.com/
         – Preservation Resources: http://www.oclc.org/oclc/presres/
         – Saztec: http://www.saztec.com
         – Innodata: http://www.innodata.com/
         – Northern Micrographics: http://www.normicro.com/northern_micrographics.htm
    63. Sustainability
       • Digitization shouldn’t be a fling (when others are paying the bills). It is a marriage and more.
       • Time = Money
       • Permanence
       • Data Migration and Emulation
       • Review and schedule upgrades
       • Documentation
    64. Cost
       • Not cheap, but consider the value of objects, the investment already made in your collections, and your organizational mission.
       • Prices range from $7 - $35 per image
       • Most projects are funded on soft money. Attempt to incorporate scanning into normal operating budgets.
       • Scanning is 1/3 of total cost.
       • Largest cost is in research and time invested in creation of metadata or organization of collections.
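The figures on this slide can be turned into a quick budget estimate. The sketch below is back-of-the-envelope arithmetic only. It assumes that the $7 to $35 range is the full per-image project cost and that the one-third rule applies to the speaker-notes example of $3,000 to scan a 500-page rare book; neither assumption is stated outright in the slides, and the function names are hypothetical.

```python
def cost_range(image_count, low_per_image=7, high_per_image=35):
    """Low and high project cost for a number of images, in dollars."""
    return image_count * low_per_image, image_count * high_per_image


def total_from_scanning_cost(scanning_cost, scanning_share_denominator=3):
    """Total project cost if scanning is 1/3 (by default) of the total."""
    return scanning_cost * scanning_share_denominator


# 500 images at the slide's per-image range.
print(cost_range(500))  # (3500, 17500)

# Speaker-notes example: $3,000 of scanning for a 500-page book.
total = total_from_scanning_cost(3000)
print(total, total / 500)  # 9000 total, 18.0 dollars per image
```

Under these assumptions the notes' example works out to $18 per image, which falls inside the quoted $7 to $35 range.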
    65. Staffing
       • Staff with tolerance for ambiguity
       • Staff with creativity
       • Training in metadata, scanning
       • Photographic skills (artistic eye), microcomputer skills, web design skills
       • Staff with risk-taking attitude
    66. Concluding Thoughts
       • Create digital products worth preserving
       • Collaborate!
       • Adhere to standards
       • Refresh/migrate your data
       • Don’t forget preservation metadata: digital products are not copies, but new artifacts