Planning and Managing Digital Library & Archive Projects


Presented at METRO on March 23, 2011

  • Transforming a very old and venerable academic library
  • - Packing it up
  • Into something more interactive, collaborative, and engaging for the students, faculty and staff at the college. If you’ve been there recently you will know what I mean. A big part of it is moving large swaths of functions to digital.
  • Merged institutional repository, digital library, and digital archive. Social archive: institutional repository for current work of faculty, staff, and students. Web 2.0 design patterns. Digitization of archives for prior work; be able to distinguish self-archiving from library archiving efforts.
  • Stroke
  • Does not deplete the more it gets used
  • Planning and Managing Digital Library & Archive Projects

    1. 1. Metropolitan New York Library Council ~ March 23, 2011 Dr. Anthony Cocciolo ~ Assistant Professor Pratt Institute ~ School of Information and Library Science
    2. 2. Workshop Schedule <ul><li>10am – 1pm </li></ul><ul><ul><li>Introduction & Workshop Overview </li></ul></ul><ul><ul><li>Developing a Strategy for Success </li></ul></ul><ul><ul><li>Managing Digital Assets: Born-digital and conversion </li></ul></ul><ul><li>1pm – 2pm – Lunch! </li></ul><ul><li>2pm – 4pm </li></ul><ul><ul><li>Creating an Infrastructure: Technical, Organizational and Resources </li></ul></ul><ul><ul><li>Evaluating your Project </li></ul></ul>
    3. 13. What is a Digital Library? <ul><li>focused collection of digital objects, including text, video, and audio, along with methods for access and retrieval, and for selection, organization, and maintenance of the collection. </li></ul><ul><ul><ul><li>Witten, Bainbridge and Nichols (2010) </li></ul></ul></ul>
    4. 17. Digital Archives
    5. 25. Geostoryteller
    6. 31. Introductions <ul><li>Name </li></ul><ul><li>What are you currently up to? (Student, Working as Librarian, Archivist, etc. at X Institution, Looking for work) </li></ul><ul><li>Why are you interested in this class? (Starting a Digital Library, my boss made me, etc.) </li></ul>
    7. 32. Planning & Managing Digital Library & Archive Projects Developing a Strategy for Success
    8. 34. Digital Libraries and Archives are Socio-technical systems.
    9. 43. Setting an agenda for a Digital Library/Archive Project <ul><li>Trends in Information Use </li></ul><ul><ul><li>If it’s not easy to get at... </li></ul></ul><ul><ul><li>Social media, social nature of information </li></ul></ul><ul><li>Community Needs Assessment </li></ul><ul><ul><li>Survey, make it representative </li></ul></ul><ul><ul><li>Focus groups, Interviews </li></ul></ul><ul><ul><li>Problems with… </li></ul></ul><ul><li>Use your institution's creativity; hold a design event. </li></ul>
    10. 44. Sample Size Calculator
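A minimal sketch of what a sample size calculator computes when planning a representative survey. The slide does not say which formula its calculator uses; this assumes Cochran's formula with a finite population correction, and the function name and default values (95% confidence, 5% margin of error) are illustrative assumptions.

```python
import math


def sample_size(population, margin_of_error=0.05, z=1.96, proportion=0.5):
    """Estimate how many survey responses are needed (assumed formula).

    population: size of the community being surveyed
    margin_of_error: acceptable error, e.g. 0.05 for +/- 5%
    z: z-score for the confidence level (1.96 is roughly 95%)
    proportion: expected response split; 0.5 is the most conservative
    """
    # Cochran's formula for a very large population.
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    # Finite population correction shrinks the sample for small communities.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical college community of 5,000 people.
    print(sample_size(5000))
```

Under these assumptions, a community of 5,000 needs far fewer than 5,000 responses, which is why a well-drawn sample can stand in for surveying everyone.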
    11. 45. Design Event <ul><li>Have one or more people facilitate the event and take responsibility for moving it forward. Schedule a 2.5-4 hour event, with a working lunch in the middle. </li></ul><ul><li>Assemble various stakeholders from across the institution. Provide background information. </li></ul><ul><li>Divide into groups with members of diverse backgrounds. </li></ul><ul><li>Run an icebreaker and warm-up activities (looking at good & bad digital libraries with targeted questions), then design the digital library user experience using simple materials (markers, etc.). </li></ul><ul><li>Present out to the group as a whole. </li></ul>
    12. 50. PocketKnowledge (Teachers College, Columbia University): interface mockup of a community pocket page, showing the navigation (Search, Communities, Tags, Authors, Uploaders), the "Money Class" sub-community with 5 items, view, sort, and role filters, intersecting Communities A, B, and C, comments, and RSS.
    13. 55. A good strategy should… <ul><li>be focused on your users and how it will benefit them. </li></ul><ul><ul><li>A focus on the needs of the collection, divorced from the needs of users, could lead you to a product with no users. </li></ul></ul><ul><ul><li>Grant funders: the worst thing is to create something that just sits there (no impact, low use). </li></ul></ul><ul><li>How will this digital project impact your community? </li></ul>
    14. 56. On Strategy <ul><li>What will community members learn from this project? How will you know if they have learned something from your project? </li></ul><ul><li>Why would someone be intrinsically motivated to use your digital library? </li></ul><ul><li>How will your project advance specific learning outcomes (class goals), or more general learning outcomes (critical thinking, literacies)? </li></ul>
    15. 57. Talking Strategy <ul><li>Get into groups of 4 </li></ul><ul><li>Pick a digital project you have worked on or are hoping to start working on. What is your strategy for success? </li></ul><ul><ul><li>Who is your community? How will it impact your community? What will individuals learn from using it? Why is it an important project? Why do you think your strategy is a good one? How will you know if it is successful? </li></ul></ul>
    16. 58. Planning & Managing Digital Library & Archive Projects Managing Digital Assets: Born-digital and conversion
    17. 59. Living in a hybrid world <ul><li>Two paradigms: </li></ul><ul><ul><li>Digitizing artifacts paradigm </li></ul></ul><ul><ul><ul><li>History / Old Stuff </li></ul></ul></ul><ul><ul><ul><li>Finite </li></ul></ul></ul><ul><ul><ul><li>Not something that will go on forever (although to some degree we will always discover old objects; archaeology) </li></ul></ul></ul><ul><ul><li>Capturing digital material paradigm </li></ul></ul><ul><ul><li>Bizarre middle ground </li></ul></ul>
    18. 62. Born digital <ul><li>Does the person own the material they are giving to you? </li></ul><ul><ul><li>Is it copyrighted? How about Creative Commons licensing? </li></ul></ul><ul><li>Terms of use – what will the creator allow you to do with it? </li></ul><ul><li>Formats- do you have the best copy? </li></ul><ul><li>Who will create metadata for it? </li></ul>
    19. 63. Digital Conversion <ul><li>Can you digitize? Who can you make that digitization available to? </li></ul><ul><ul><li>Legal </li></ul></ul><ul><ul><ul><li>Preservation- If it is falling apart (e.g., audio, film) </li></ul></ul></ul><ul><ul><ul><li>Public Domain – life of author +70 years </li></ul></ul></ul><ul><ul><ul><li>International Publication, Only make available to your community </li></ul></ul></ul><ul><ul><ul><li>DMCA </li></ul></ul></ul><ul><ul><ul><li>Litigious Persons – Dance Project </li></ul></ul></ul><ul><ul><li>Ethical – LHA project </li></ul></ul>
    20. 64. Making Digital Images <ul><li>Create Digital Masters </li></ul><ul><ul><li>Can create a variety of derivatives from the master for access needs </li></ul></ul><ul><li>What scanning settings to choose? </li></ul><ul><ul><li>Use the Cornell approach (using Quality Index) </li></ul></ul><ul><ul><li>Choose an already developed standard for type of visual media </li></ul></ul>
    21. 67. Bitonal: ppi = 3QI / (0.039h). Color/Gray: ppi = 2QI / (0.039h). QI values: barely legible (3.0), marginal (3.6), good (5.0), and excellent (8.0); h is the height in mm of the smallest detail.
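The Quality Index formulas on this slide can be worked through as a short calculation. This is a sketch only: the function name and the example document measurements are assumptions for illustration, while the constants (3, 2, and 0.039) and the QI values come from the slide.

```python
def required_ppi(quality_index, detail_height_mm, bitonal=True):
    """Scanning resolution (ppi) for a target Quality Index.

    quality_index: 3.0 barely legible, 3.6 marginal, 5.0 good, 8.0 excellent
    detail_height_mm: height in mm of the smallest significant detail
    bitonal: True for bitonal scans, False for color or grayscale
    """
    if detail_height_mm <= 0:
        raise ValueError("detail height must be positive")
    # The slide's formulas: 3QI/(0.039h) for bitonal, 2QI/(0.039h) otherwise.
    factor = 3 if bitonal else 2
    return factor * quality_index / (0.039 * detail_height_mm)


if __name__ == "__main__":
    # Hypothetical page whose smallest character is 1 mm tall, excellent quality.
    print(round(required_ppi(8.0, 1.0, bitonal=True)))
    # Hypothetical photograph caption with 2 mm detail, good quality.
    print(round(required_ppi(5.0, 2.0, bitonal=False)))
```

Smaller details and higher quality targets both push the required resolution up, so measuring the smallest significant detail before scanning avoids both under-scanning and needlessly large masters.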
    22. 74. Some problems <ul><li>Would not be a problem if this was a derivative of a digital master. </li></ul><ul><li>Uses Arial font, not invented until 1982 (1906 document) </li></ul><ul><li>Lost page numbers </li></ul><ul><li>Headers and footers? Usually include a bit of citation information. </li></ul><ul><li>Formatting is not faithful to original </li></ul><ul><li>Other info? Advertisements? </li></ul><ul><li>Lose any traces of how this was bound as a book (the context in which it was used). Makes you start to question the authenticity, especially if the PDF gets disconnected from the rest of the collection (e.g., this PDF was “discovered”). Would a historian want to use this? </li></ul><ul><li>Human and computer error when converting the image to digital text </li></ul><ul><li>CS way of thinking: but all the data is there! </li></ul>
    23. 78. Digitizing Audio <ul><li>The minimum: </li></ul><ul><ul><li>44.1 kHz </li></ul></ul><ul><ul><li>16-bit </li></ul></ul><ul><ul><li>Stereo, 2-Channel </li></ul></ul><ul><ul><li>More info in Sound Directions book (web reference) </li></ul></ul>
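For storage planning, the minimum settings above translate directly into file sizes. The sketch below assumes uncompressed PCM audio (as in a WAV file, ignoring the small header); the function name is hypothetical, and the defaults are the slide's minimum values.

```python
def pcm_bytes(duration_seconds, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Approximate size in bytes of uncompressed PCM audio.

    Defaults are the minimum settings from the slide:
    44.1 kHz, 16-bit, stereo (2 channels).
    """
    bytes_per_sample = bit_depth // 8
    return int(duration_seconds * sample_rate_hz * bytes_per_sample * channels)


if __name__ == "__main__":
    one_hour = pcm_bytes(60 * 60)
    # Report in decimal megabytes (1 MB = 1,000,000 bytes).
    print(f"One hour of audio: {one_hour / 1_000_000:.0f} MB")
```

At these settings one hour of audio takes roughly 635 MB, so a collection of a few hundred recorded hours already calls for a budgeted storage plan.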
    24. 79. Metadata
    25. 80. DACS EAD MARC Other output formats
    27. 83. Computer generated metadata <ul><li>Determining the language of a digital document is very accurate (99+% correct) </li></ul>
    28. 84. Most Digital Libraries are run on a CMS <ul><li>The user interface for the database management system (like MySQL), making the DB user-friendly and appropriate for the website’s function. </li></ul><ul><li>Usually a public side and a staff side, with varying degrees of control of the CMS. </li></ul><ul><li>YouTube is a big CMS. </li></ul><ul><li>A CMS runs on one or more servers. </li></ul>
    29. 85. <ul><li>Server </li></ul><ul><ul><li>Running an OS, such as Linux, Mac OS X Server, Windows Server 2008. Dif. </li></ul></ul><ul><ul><li>Database server: like MySQL, Oracle </li></ul></ul><ul><ul><li>Content Management System: like Omeka, DSpace </li></ul></ul><ul><ul><li>File System: Containing digital files (.wav, .pdf, etc.) </li></ul></ul>Switches and Routers, connected to Internet Service Providers or other Wide Area Networks, Academic Networks Internet (same thing as the other blob below)
    30. 86. CMS Infrastructure <ul><li>LAMP </li></ul><ul><ul><li>Linux – the operating system – like Windows or Mac OS X except good for web servers </li></ul></ul><ul><ul><li>Apache – the web server – responds to HTTP requests </li></ul></ul><ul><ul><ul><li>The Microsoft equivalent is IIS – Internet Information Services. Apache is run mostly on Linux and Mac OS X Server, and occasionally on Windows. </li></ul></ul></ul><ul><ul><li>MySQL – the relational database management system </li></ul></ul><ul><ul><li>PHP – the programming language that the CMS is written in </li></ul></ul><ul><li>Contrast with WAMP, Server vs. Personal Computer </li></ul>
    31. 89. Outsourcing <ul><li>Create a detailed projected timeline </li></ul><ul><ul><li>State the date on which you can expect each deliverable. </li></ul></ul><ul><ul><li>Don’t let the timeline slip; hold the vendor accountable for the timeline; ask for discounts if the vendor slips from it </li></ul></ul><ul><li>Create a detailed budget </li></ul><ul><ul><li>Itemize each component </li></ul></ul>
    32. 90. Handout example
    33. 92. Planning & Managing Digital Library & Archive Projects Creating an infrastructure: Technical, Organizational & Resource
    34. 93. Hollywood <ul><li>Fewer than half of the feature films before 1950 have survived </li></ul><ul><ul><li>Less than 20% survive from the 1920s </li></ul></ul>
    35. 95. <ul><li>One of the biggest movies of 1954. </li></ul><ul><li>Nominated for 6 Academy Awards, winner of 2 </li></ul><ul><li>Winner of 2 Golden Globes </li></ul>
    36. 96. Archival Masters <ul><li>With the advent of TV and ability to re-broadcast movies on TV, followed by advent of VHS players, Hollywood began to realize that there was a monetary incentive to keep archival masters so the film could be reproduced onto different media (TVs, VHS tape, DVD). </li></ul>
    37. 98. Film Preservation <ul><li>“Film in the Freezer”, “Store and Ignore” </li></ul><ul><li>Private Vaults </li></ul>
    38. 101. Long term access <ul><li>Hollywood: Want to ensure archival masters for at least 100 years </li></ul><ul><ul><li>Most libraries and archives strive for something like “eternal” access. </li></ul></ul>
    39. 103. Challenge <ul><li>No hardware or software can ensure long term access on its own; the media will break down in anywhere from 5 to 10 years. </li></ul><ul><li>“Store and ignore” while concentrating on environmental conditions (like humidity & temperature) will not work. </li></ul><ul><ul><li>For example, magnetic hard drives cannot be stored on a shelf for long periods of time. This is because the internal lubrication will be affected by “stiction,” where internal components lock up. Magnetic hard drives should be powered on and spinning. Even then, they still have a limited operational lifetime. </li></ul></ul>
    40. 105. Doing Digital Preservation <ul><li>Permanence in the digital sense means ongoing and systematic preservation process; an active management approach is required. </li></ul><ul><li>It is more like maintaining a car, than putting a book on a shelf. </li></ul>
    41. 106. Implications (1) <ul><li>That means that the data will be migrated on a schedule </li></ul><ul><ul><li>Factor migration time (labor) and costs into budgets and strategic plans </li></ul></ul><ul><li>Should be talking in terms of $/TB/year </li></ul><ul><ul><li>Labor and electricity costs should be factored in, not just media costs </li></ul></ul><ul><ul><li>Should include backups and any other copies you will be making </li></ul></ul><ul><li>Example last week was misleading, must always factor in time. </li></ul>
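A rough sketch of how a $/TB/year figure might be worked out from the cost factors listed above. The function name, parameter names, and every dollar amount in the example are hypothetical and only illustrate the arithmetic; they are not figures from the workshop.

```python
def cost_per_tb_per_year(terabytes, years, media_cost_per_tb, copies,
                         media_purchases, labor_per_year, electricity_per_year):
    """Total preservation cost expressed as dollars per terabyte per year.

    copies: number of copies kept (primary plus backups)
    media_purchases: how many times media is bought over the period,
        i.e. the initial purchase plus each scheduled migration
    """
    if terabytes <= 0 or years <= 0:
        raise ValueError("terabytes and years must be positive")
    # Every copy needs new media at every purchase or migration cycle.
    media_total = media_cost_per_tb * terabytes * copies * media_purchases
    # Labor and electricity recur every year, not just at purchase time.
    running_total = (labor_per_year + electricity_per_year) * years
    return (media_total + running_total) / (terabytes * years)


if __name__ == "__main__":
    # Hypothetical: 10 TB kept for 10 years in 3 copies, media bought twice.
    print(cost_per_tb_per_year(
        terabytes=10, years=10, media_cost_per_tb=100, copies=3,
        media_purchases=2, labor_per_year=2000, electricity_per_year=500,
    ))
```

In this made-up case the media accounts for only $6,000 of a $31,000 total, which illustrates the slide's point that quoting media prices alone is misleading.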
    42. 107. Implications (2) <ul><li>Media (CDs, DVDs, Blu-rays, Gold DVDs) or hard drives kept on a shelf or under a desk are not a good digital archive strategy. </li></ul><ul><ul><li>If you see this, know that it is bad practice, and work to change it. </li></ul></ul><ul><li>A (Trusted) Digital Repository that is (almost) always powered, redundant, and backed up is the best strategy. </li></ul>
    43. 108. Implications (3) <ul><li>Heavy use is one of the best defenses against digital loss. </li></ul><ul><ul><li>Patrons will notice if something is amiss. </li></ul></ul><ul><ul><li>Complete opposite of physical preservation. </li></ul></ul>
    44. 109. Managing Digital Content <ul><li>Physical media is almost never an appropriate digital preservation strategy. Most commercial sites aren’t either. </li></ul>
    45. 114. Trusted Digital Repository <ul><li>You can make your own Trusted Digital Repository or join a group that has one. </li></ul>
    46. 115. Organizational Infrastructure <ul><li>Policy framework </li></ul><ul><ul><li>Mission statement </li></ul></ul><ul><li>Financial sustainability/framework (Columbia example) </li></ul><ul><li>Organizational viability </li></ul><ul><ul><li>Have a succession plan </li></ul></ul>
    47. 116. Technology <ul><li>Redundant hard disks </li></ul><ul><li>Backup, move to offsite, security </li></ul><ul><li>Physical security, staff w/security </li></ul><ul><li>Physical environment (Air conditioning, above 80 deg F, redundant) </li></ul><ul><li>Electricity (UPS, backup generator, surge protector, voltage regulator); power is always on. </li></ul><ul><li>Piggyback on what IT is already doing, if they are running an enterprise records management system (e.g., Banner, PeopleSoft, Datatel). </li></ul>
    48. 119. Evaluating your Project Planning & Managing Digital Library & Archive Projects
    49. 120. On Evaluating <ul><li>Evaluation is usually started after something has been completed or has had time to be used. </li></ul><ul><ul><li>Used to inform decisions (replication, discontinuation, refinements, more investment, etc.) </li></ul></ul><ul><li>An alternative is to do mini-evaluations with the user community as you develop. </li></ul><ul><ul><li>This can be a challenge if you don’t have a user community yet (e.g., have your mom try it out). </li></ul></ul><ul><li>Evaluation is not the same as usability </li></ul>
    50. 121. Evaluation Methods <ul><li>Quantitative: Analysis of numerical data (surveys, logs) </li></ul><ul><ul><li>Criticized for not getting at what people really think </li></ul></ul><ul><li>Qualitative: Analysis of words (e.g., interview transcript), pictures, objects </li></ul><ul><ul><li>Criticized for being biased, not representative </li></ul></ul><ul><li>Mixed Methods: Depending on the decisions that you are trying to make, you may want to triangulate (use multiple methods to get at what you are looking for). Example: Survey, Focus Groups & Transaction Log Analysis. Of course, the ability to do all that is subject to budget & time constraints. </li></ul>
    51. 122. Sampling <ul><li>Whichever method you use, sampling is important </li></ul><ul><ul><li>Get a representative sample that accurately represents the entire population </li></ul></ul><ul><li>Sampling is not important where you capture 100% of the data, such as in transaction log analysis </li></ul><ul><li>Qualitative Methods </li></ul><ul><ul><li>You can remove the interpretive bias by using formal qualitative data analysis methods </li></ul></ul><ul><ul><ul><li>Use independent coders of transcripts to see the extent to which your interpretations coincide. </li></ul></ul></ul>
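One way to quantify how far two independent coders' interpretations coincide is an agreement statistic. The slide does not name a specific measure; the sketch below assumes Cohen's kappa, a common choice for two coders, and the function name and example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Agreement between two coders, corrected for chance agreement.

    codes_a, codes_b: the code each coder assigned to the same
    transcript segments, in the same order.
    Returns 1.0 for perfect agreement and about 0 for chance-level agreement.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty segments")
    n = len(codes_a)
    # Share of segments where the coders chose the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's code frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # Both coders used a single identical code throughout.
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical codes for four interview segments.
    coder_1 = ["access", "access", "cost", "cost"]
    coder_2 = ["access", "access", "cost", "access"]
    print(cohens_kappa(coder_1, coder_2))
```

A low value suggests the coding scheme or its definitions need refining before the qualitative findings are reported.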
    52. 123. Compare alongside past projects
    53. 124. Thank you. Anthony Cocciolo [email_address]