Planning and Managing Digital Library & Archive Projects




Presented at METRO on March 23, 2011







Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

  • Transforming a very old and venerable academic library
  • - Packing it up
  • Into something more interactive, collaborative, and engaging for the students, faculty and staff at the college. If you’ve been there recently you will know what I mean. A big part of it is moving large swaths of functions to digital.
  • Merged institutional repository, digital library, digital archive. Social archive: institutional repository for current work of faculty, staff, students. Web 2.0 design patterns. Digitization of archives for prior work - be able to distinguish self-archiving from library archiving efforts
  • Stroke
  • Does not deplete the more it gets used

Planning and Managing Digital Library & Archive Projects: Presentation Transcript

  • Metropolitan New York Library Council ~ March 23, 2011 Dr. Anthony Cocciolo ~ Assistant Professor Pratt Institute ~ School of Information and Library Science
  • Workshop Schedule
    • 10a – 1pm
      • Introduction & Workshop Overview
      • Developing a Strategy for Success
      • Managing Digital Assets: Born-digital and conversion
    • 1pm – 2pm – Lunch!
    • 2 – 4pm
        • Creating an Infrastructure: Technical, Organizational and Resources
      • Evaluating your Project
  • What is a Digital Library?
    • focused collection of digital objects, including text, video, and audio, along with methods for access and retrieval, and for selection, organization, and maintenance of the collection.
        • Witten, Bainbridge and Nichols (2010)
  • Digital Archives
  • Geostoryteller
  • Introductions
    • Name
    • What are you currently up to? (Student, Working as Librarian, Archivist, etc. at X Institution, Looking for work)
    • Why are you interested in this class? (Starting a Digital Library, my boss made me, etc.)
  • Planning & Managing Digital Library & Archive Projects Developing a Strategy for Success
  • Digital Libraries and Archives are Socio-technical systems.
  • Setting an agenda for a Digital Library/Archive Project
    • Trends in Information Use
      • If it’s not easy to get at...
      • Social media, social nature of information
    • Community Needs Assessment
      • Survey, make it representative
      • Focus groups, Interviews
      • Problems with…
    • Use your institution's creativity; hold a design event.
  • Sample Size Calculator
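A sample size calculator for a community survey can be sketched in a few lines. The slide does not say which formula the calculator uses; the sketch below assumes Cochran's formula with a finite population correction, a common choice for surveys. The function name and the defaults (95% confidence, 5% margin of error) are illustrative assumptions, not values from the workshop.

```python
import math


def sample_size(population, margin_of_error=0.05, z_score=1.96, proportion=0.5):
    """Estimate how many survey responses are needed for a representative sample.

    Assumes Cochran's formula with a finite population correction.
    z_score=1.96 corresponds to 95% confidence; proportion=0.5 is the most
    conservative guess about how responses will split.
    """
    # Sample size for a very large (effectively infinite) population.
    n0 = (z_score ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    # Shrink the estimate for a small, known community size.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical community sizes, for illustration only.
    for community in (500, 1000, 10000):
        print(community, sample_size(community))
```

The required sample grows much more slowly than the community itself, which is why a representative survey is feasible even for a large institution.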
  • Design Event
    • Have someone(s) facilitate the event; be responsible for moving the event forward. Schedule for a 2.5-4 hour event, with working lunch in the middle.
    • Assemble various stakeholders from across the institution. Provide background information.
    • Divide into groups with members of diverse backgrounds
    • Icebreaker activity, warm-up activities (looking at good & bad digital libraries with targeted questions), and design the digital library user experience, using simple materials (markers, etc.)
    • Present out to the group as a whole
  • [Screenshot: PocketKnowledge, Teachers College, Columbia University. Navigation: Search, Communities, Tags, Authors, Uploaders. Shows the "Money Class" community pocket (5 items) with thumbnail and list views, sorting by alphabetical, date, or popularity, and filtering by role (all, student, staff, faculty, other), alongside Community A (52 items), Community B (32 items), and Community C (32 items), each with "Intersect with" and "View all" options.]
  • A good strategy should…
    • be focused on your users and how it will benefit them.
      • Focusing on the needs of the collection, divorced from this factor, could lead you to a product with no users.
      • Grant funders: worst thing is to create something that just sits there (no impact, low use).
    • How will this digital project impact your community?
  • On Strategy
    • What will community members learn from this project? How will you know if they have learned something from your project?
    • Why would someone be intrinsically motivated to use your digital library?
    • How will your project advance specific learning outcomes (class goals), or more general learning outcomes (critical thinking, literacies)?
  • Talking Strategy
    • Get into groups of 4
    • Pick a digital project you have worked on or are hoping to start working on. What is your strategy for success?
      • Who is your community? How will it impact your community? What will individuals learn from using it? Why is it an important project? Why do you think your strategy is a good one? How will you know if it is successful?
  • Planning & Managing Digital Library & Archive Projects Managing Digital Assets: Born-digital and conversion
  • Living in a hybrid world
    • Two paradigms:
      • Digitizing artifacts paradigm
        • History / Old Stuff
        • Finite
        • Not something that will go on forever (although to some degree we will always discover old objects; archaeology)
      • Capturing digital material paradigm
      • Bizarre middle ground
  • Born digital
    • Does the person own the material they are giving to you?
      • Is it copyrighted? How about Creative Commons licensing?
    • Terms of use – what will the creator allow you to do with it?
    • Formats- do you have the best copy?
    • Who will create metadata for it?
  • Digital Conversion
    • Can you digitize? Who can you make that digitization available to?
      • Legal
        • Preservation- If it is falling apart (e.g., audio, film)
        • Public Domain – life of author +70 years
        • International Publication, Only make available to your community
        • DMCA
        • Litigious Persons – Dance Project
      • Ethical – LHA project
  • Making Digital Images
    • Create Digital Masters
      • Can create a variety of derivatives from the master for access needs
    • What scanning settings to choose?
      • Use the Cornell approach (using Quality Index)
      • Choose an already developed standard for type of visual media
  • Bitonal: ppi = 3QI / (0.039h); Color/Gray: ppi = 2QI / (0.039h)
    • QI: barely legible (3.0), marginal (3.6), good (5.0), and excellent (8.0)
    • h is the height in mm of the smallest detail
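The Quality Index formulas above translate directly into a small helper for choosing a scanning resolution. This is a minimal sketch; the function name and the 1 mm example detail height are assumptions made for illustration.

```python
def required_ppi(quality_index, detail_height_mm, bitonal=True):
    """Return the scanning resolution (ppi) for a target Quality Index.

    Implements the formulas from the slide:
      bitonal:    ppi = 3 * QI / (0.039 * h)
      color/gray: ppi = 2 * QI / (0.039 * h)
    where h is the height in mm of the smallest detail to be captured.
    """
    factor = 3 if bitonal else 2
    return factor * quality_index / (0.039 * detail_height_mm)


if __name__ == "__main__":
    # Hypothetical example: smallest character is 1 mm tall, target QI is "good" (5.0).
    print(round(required_ppi(5.0, 1.0, bitonal=True)))
    print(round(required_ppi(5.0, 1.0, bitonal=False)))
```

For the same target quality, smaller details call for a proportionally higher resolution, and bitonal scans need more resolution than color or grayscale scans.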
  • Some problems
    • Would not be a problem if this was a derivative of a digital master.
    • Uses Arial font, not invented until 1982 (1906 document)
    • Lost page numbers
    • Headers and footers? Usually include a bit of citation information.
    • Formatting is not faithful to original
    • Other info? Advertisements?
    • Lose any traces of how this was bound as a book (the context in which it was used). Makes you start to question the authenticity, especially if the PDF gets disconnected from the rest of the collection (e.g., this PDF was “discovered”). Would a historian want to use this?
    • Human Error & Computer error of changing image to digital text
    • CS way of thinking: but all the data is there!
  • Digitizing Audio
    • The minimum:
      • 44.1 kHz
      • 16-bit
      • Stereo, 2-Channel
      • More info in Sound Directions book (web reference)
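The minimum audio settings above determine how much storage an uncompressed master needs, which matters when budgeting. The sketch below assumes uncompressed PCM audio (as in a typical .wav file) and ignores the small file header; the function name is illustrative.

```python
def pcm_bytes(duration_seconds, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Approximate size in bytes of uncompressed PCM audio (header not included).

    Defaults match the minimum settings on the slide: 44.1 kHz, 16-bit, stereo.
    """
    bytes_per_sample = bit_depth // 8
    return int(duration_seconds * sample_rate_hz * bytes_per_sample * channels)


if __name__ == "__main__":
    one_hour = pcm_bytes(60 * 60)
    print(f"One hour at the minimum settings: {one_hour / 1_000_000:.0f} MB")
```

Raising the sample rate or bit depth scales the storage requirement proportionally.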
  • Metadata
  • DACS EAD MARC Other output formats
  • Computer generated metadata
    • Determining the language of a digital document is very accurate (99+% correct)
  • Most Digital Libraries are run on a CMS
    • The user interface for the database management system (like MySQL), making the DB user-friendly and appropriate for website’s function.
    • Usually a public-side and staff side; varying degrees of control of the CMS.
    • YouTube is a big CMS.
    • A CMS runs on one or more servers.
    • Server
      • Running an OS, such as Linux, Mac OS X Server, Windows Server 2008. Dif.
      • Database server: like MySQL, Oracle
      • Content Management System: like Omeka, DSpace
      • File System: Containing digital files (.wav, .pdf, etc.)
    • Switches and routers, connected to Internet Service Providers or other Wide Area Networks, academic networks, and the Internet
  • CMS Infrastructure
    • LAMP
      • Linux – the operating system – like Windows or Mac OS X except good for web servers
      • Apache – the web server – responds to HTTP requests
        • The Microsoft equivalent is IIS – Internet Information Services. Apache is run mostly on Linux and Mac Server, and occasionally on Windows.
      • MySQL – the relational database management system
      • PHP – the programming language that the CMS is written in
    • Contrast with WAMP, Server vs. Personal Computer
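The division of labor in a CMS stack can be illustrated with a toy example: a database holds the records, and CMS code turns a record into a page that the web server sends back. This is only a sketch under stated assumptions. It uses Python and the standard-library sqlite3 module as stand-ins for PHP and MySQL, and the table, column, and function names are hypothetical; it is not how Omeka or DSpace are implemented.

```python
import sqlite3
from html import escape


def create_catalog():
    """Build a tiny in-memory catalog (stand-in for the MySQL database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, file_path TEXT)"
    )
    # The record points at a file stored on the server's file system.
    conn.execute(
        "INSERT INTO items (title, file_path) VALUES (?, ?)",
        ("Oral history interview", "files/interview.wav"),
    )
    conn.commit()
    return conn


def render_item_page(conn, item_id):
    """Look up one item and return the HTML the web server would send back."""
    row = conn.execute(
        "SELECT title, file_path FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    if row is None:
        return "<h1>Item not found</h1>"
    title, file_path = row
    return f'<h1>{escape(title)}</h1><a href="{escape(file_path)}">Download</a>'


if __name__ == "__main__":
    catalog = create_catalog()
    print(render_item_page(catalog, 1))
```

The public side of a CMS is essentially many functions like `render_item_page`, while the staff side adds forms for inserting and editing the records.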
  • Outsourcing
    • Create a detailed projected timeline
      • What date you can expect each deliverable.
      • Don’t let the timeline slip; hold the vendor accountable for the timeline; ask for discounts if the vendor slips from the timeline
    • Create a detailed budget
      • Itemize each component
  • Handout example
  • Planning & Managing Digital Library & Archive Projects Creating an infrastructure: Technical, Organizational & Resource
  • Hollywood
    • Fewer than half of the feature films before 1950 have survived
      • Less than 20% survive from the 1920s
    • One of the biggest movies of 1954.
    • Nominated for 6 Academy Awards, winner of 2
    • Winner of 2 Golden Globes
  • Archival Masters
    • With the advent of TV and ability to re-broadcast movies on TV, followed by advent of VHS players, Hollywood began to realize that there was a monetary incentive to keep archival masters so the film could be reproduced onto different media (TVs, VHS tape, DVD).
  • Film Preservation
    • “Film in the Freezer”, “Store and Ignore”
    • Private Vaults
  • Long term access
    • Hollywood: Want to ensure archival masters for at least 100 years
      • Most libraries and archives strive for something like “eternal” access.
  • Challenge
    • There is no hardware or software that can ensure long-term access alone; the media will break down in anywhere from 5 to 10 years.
    • “Store and ignore” while concentrating on environmental conditions (like humidity & temperature) will not work.
      • For example, magnetic hard drives cannot be stored on a shelf for long periods of time. This is because of “stiction,” where the internal lubrication degrades and internal components lock up. Magnetic hard drives should be powered on and spinning, and even then they have a limited operational lifetime.
  • Doing Digital Preservation
    • Permanence in the digital sense means an ongoing and systematic preservation process; an active management approach is required.
    • It is more like maintaining a car than putting a book on a shelf.
  • Implications (1)
    • That means that the data will be migrated on a schedule
      • Factor migration time (labor), costs in budget and in strategic plans
    • Should be talking in terms of $/TB/year
      • Labor and electricity costs should be factored in, not just media costs
      • Should be including backup and other multiple copies you will be making
    • Example last week was misleading, must always factor in time.
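The $/TB/year figure can be worked out with simple arithmetic once labor, electricity, and multiple copies are included. The sketch below is one way to set up the calculation; the function name, the cost categories as parameters, and all dollar amounts are hypothetical and only illustrate the reasoning.

```python
def cost_per_tb_per_year(media_cost_per_copy, labor_cost, electricity_cost,
                         terabytes_stored, copies=3, years=5):
    """Return the storage cost in dollars per terabyte per year.

    media_cost_per_copy: hardware/media cost for ONE copy over the whole period.
    labor_cost, electricity_cost: totals over the whole period
        (labor includes the time spent on scheduled migrations).
    copies: the primary copy plus backups and other replicas.
    years: length of the period before the next migration.
    """
    total = media_cost_per_copy * copies + labor_cost + electricity_cost
    return total / (terabytes_stored * years)


if __name__ == "__main__":
    # Hypothetical figures for a 10 TB collection over a 5-year migration cycle.
    media_only = cost_per_tb_per_year(1000, 0, 0, 10, copies=1)
    full_cost = cost_per_tb_per_year(1000, 6000, 1500, 10, copies=3)
    print(f"Media only: ${media_only:.2f}/TB/year")
    print(f"All costs:  ${full_cost:.2f}/TB/year")
```

With these made-up numbers, counting only the media understates the cost roughly tenfold, which is the point of factoring in time and copies.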
  • Implications (2)
    • Media (CDs, DVDs, Blu-rays, Gold DVDs) or hard drives kept on a shelf or under a desk are not a good digital archive strategy.
      • If you see this, know that it is bad practice, and work to change it.
    • (Trusted) Digital Repository that is (almost) always powered, redundant, and backed-up is the best strategy.
  • Implications (3)
    • Heavy use is one of the best defenses against digital loss.
      • Patrons will notice if something is amiss.
      • Complete opposite of physical preservation.
  • Managing Digital Content
    • Physical media is almost never an appropriate digital preservation strategy. Most commercial sites aren’t either.
  • Trusted Digital Repository
    • You can make your own Trusted Digital Repository or join a group that has one.
  • Organizational Infrastructure
    • Policy framework
      • Mission statement
    • Financial sustainability/framework (Columbia example)
    • Organizational viability
      • Have a succession plan
  • Technology
    • Redundant hard disks
    • Backup, move to offsite, security
    • Physical security, staff w/security
    • Physical environment (Air conditioning, above 80 deg F, redundant)
    • Electricity (UPS, backup generator, surge protection, voltage regulator). Power is always on.
    • Piggyback on what IT is already doing, if they are running an enterprise records management system (e.g., Banner, PeopleSoft, Datatel).
  • Evaluating your Project Planning & Managing Digital Library & Archive Projects
  • On Evaluating
    • Evaluation usually starts after something has been completed or has had time to be used.
      • Used to inform decisions (replication, discontinuation, refinements, more investment, etc.)
    • Alternative is to do mini-evaluations with user community as you develop.
      • This can be a challenge if you don’t have a user community yet (e.g., have your mom try it out).
    • Evaluation is not the same as usability
  • Evaluation Methods
    • Quantitative: Analysis of numerical data (surveys, logs)
      • Criticized for not getting at what people really think
    • Qualitative: Analysis of words (e.g., interview transcript), pictures, objects
      • Criticized for being biased, not representative
    • Mixed Methods: Depending on the decisions that you are trying to make, you may want to triangulate (use multiple methods to get at what you are looking for). Example: Survey, Focus Groups & Transaction Log Analysis. Of course, the ability to do all of that is subject to budget & time constraints.
  • Sampling
    • Whichever method you use, sampling is important
      • Get a representative sample that accurately represents the entire population
    • Sampling is not important where you capture 100% of the data, such as in transaction log analysis
    • Qualitative Methods
      • You can remove the interpretive bias by using formal qualitative data analysis methods
        • Use independent coders of transcripts to see the extent to which your interpretations coincide.
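The extent to which two independent coders coincide can be quantified. The slide does not name a statistic; the sketch below assumes two common ones, simple percent agreement and Cohen's kappa (which corrects for agreement expected by chance). The code labels and the sample data are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of transcript segments that both coders labeled identically."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of segments")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: probability both coders pick the same label at random,
    # given how often each coder used each label.
    labels = set(codes_a) | set(codes_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in labels)
    if expected == 1:
        return 1.0  # Both coders used a single identical label throughout.
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical codes assigned to eight interview segments by two coders.
    coder_a = ["access", "access", "trust", "trust", "access", "trust", "access", "trust"]
    coder_b = ["access", "access", "trust", "access", "access", "trust", "trust", "trust"]
    print(percent_agreement(coder_a, coder_b))
    print(cohens_kappa(coder_a, coder_b))
```

A kappa well below the raw percent agreement suggests that much of the apparent agreement could have arisen by chance, which is a signal to refine the coding scheme.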
  • Compare alongside past projects
  • Thank you. Anthony Cocciolo [email_address]