Beyond the Scanned Image: A Needs Assessment of Faculty Users of Digital Collections


Presentation given by Harriett Green and Angela Courtney at the Digital Humanities 2013 conference.

  • As we know, digital collections are evolving rapidly in their format, content, and methods of use: From the many specialized digital collections created by museums, libraries, and archives to the newly launched Digital Public Library of America, more and more digitized content has been transferred from the archives to the computer screen.
  • The Bamboo Technology Project, also known as Project Bamboo, is a multi-year Mellon Foundation-funded research initiative to build an e-research platform for humanities scholarship, which for our purposes includes the arts and humanistic social sciences. The Project Bamboo participants are an international consortium of research institutions in the U.S., UK, and Australia, and research team members include teaching faculty, librarians, and academic technologists. The project began with a series of workshops and focus groups with scholars on how they conducted humanities research with digital materials (these planning reports are available on Project Bamboo’s website).
  • To determine how digital collections should be prepared to interact with the Bamboo platform, we launched a study this fall to examine these questions: in particular, what do humanities scholars need in the functionalities and features of digital collections in order to use them for humanities research?
  • I am the lead investigator of this study, and I am working with librarians and academic technologists to conduct it at these twelve research universities from the CIC and Bamboo partner institutions.
  • The study consists of an online survey and forthcoming individual interviews with humanities faculty. The survey is the part of the study that I’ll be discussing today. The survey, which ran from mid-October through the end of December, was distributed to a random one-third of the English and History faculty at the aforementioned research universities and drew 62 responses. In the survey, scholars were asked if and how they used digital collections in their research, and I’d like to highlight just a few of the results from the preliminary data analysis:
  • When respondents were asked about the specific uses of digital collections in their own research, their responses included: “I frequently need to use Japanese language texts not available in our university library (some not available in the U.S. at all). These materials can be found in digital archives maintained overseas.” As these sample responses indicate, scholars use digital collections as both critical sources and research tools: they mine mass collections of material, verify transcriptions, and access materials that previously would have required a summer travel fellowship.
  • When asked how frequently they used digital versus print resources, approximately 37% of respondents said they used print and digital resources in equal amounts, while 29% said that they used digital resources less than half of the time and 20% said more than half of the time. These data suggest that humanists rely on deeply interconnected, hybrid research practices, which must be kept in mind as we develop digital collections.
  • When respondents were asked what types of digital materials they most frequently used, texts were the top vote-getter, followed by images and maps.
  • Respondents were then asked to identify the three top features needed for digital collections in each of the following formats: texts, images, and multi-format media (which includes videos, audio files, etc.). For collections of text, the ability to download files and granular search systems were paramount. Some of the respondents’ comments included: “The ability to annotate these texts in private and public environments, where I could keep stuff for myself or share it with others.” “Use of JavaScript or some system that is compatible with software loaded on most research faculty's computers (I have colleagues who often work on out of date computer systems at home)”
  • For collections of images, the most frequently identified functionalities were viewing and zooming tools, high-quality scanned files, and the ability to edit and crop images. Some of the respondents’ comments included: “Imaged books must include flyleaves, covers, and marginalia.” “information on permissions and how to request them”
  • And for multi-format media such as video and audio files, the answers were more wide-ranging, but included comments such as: “ability to embed in other sites (I use my own websites for drafts of work in multi-media)” This also highlights a thematic need that repeatedly emerged in the overall survey data: a desire to take these digital materials and reshape and reconstitute them in visceral ways for research.
  • This study is still ongoing, with more survey data to be analyzed and a round of interviews forthcoming. But this early data, including this quote here, suggests that humanities scholars are utilizing digital resources in complex ways that mine and manipulate materials in order to weave new webs of intellectual conjecture. This evolution in scholarly practice lends support to arguments for a humanities cyber-infrastructure and virtual research platforms, which have been/will be discussed in this session by Dean and Doug, were a big theme on Thursday at the Future of Higher Ed panel, and have been investigated in research reports by the ACLS and others: collaborative, networked spaces where humanists can work and interact.
    7. • “Topic modeling, Dunning's comparisons, mining word-frequency correlations, assessing changes in style and thematic content over time”
       • “I study the mass media, so I rely on databases of old television and radio recordings.”
       • “I am a historian and I use digitized collections of newspapers, magazines, government reports and documents, legal cases, legal manuals and handbooks, books, etc.”
       • “High definition images of papyrus as basis for textual reconstructions.”
    8. • Use of digital materials versus physical materials
       “I use the "real" objects if possible. I use digital reproductions if I don't have access to the originals. Since I don't have access to many of the original objects and images I want to study, I would guess about half and half.”
    9. • Benefits of Original
       • Completeness of contents
       • Sensory experience
       • Integrate print and digital
       “I still must and do consult originals, but for teaching I have always had to rely on surrogates, and for my research, the availability of databases particularly for periodicals, historical dictionaries, and other online primary and secondary sources generally has been an enormous boon.”
    10. • Benefits of Digital
        • Portability
        • Wide accessibility
        • Preservation
        “I teach contemporary literature so no one expects me to assign a given writer’s drafts of a published novel. Why should it be any different when it comes to digital images of art that may be housed in distant museums or digitized manuscripts that may be too frail to sustain the wear and tear of hundreds of freshmen’s hands?”
    11. Frequency of Use of Digital Materials (chart): Never, 0; Less than half, 36; Equal for print & digital mat., 36; More than half, 20; Always, 7
    12. Types of Digital Materials Used (chart): Texts, 100%; Images, 93%; Maps, 58%; Video, 42%; Audio, 39%; Others, 4%; None, 0%
    13. • “links to related material or bibliographic resources”
        • “metadata must include rigorous bibliographic information and stated principles, either explicitly derived from existing modern critical editions or stated as unique to this database.”
        • “good word search capability allows me to find all instances of the use of a particular term in a given document collection or diary, narrow dates, search for names (and alternate names) of key individuals, etc.”
        • “capability to be read on multiple devices (e.g., iThing, laptop, desktop).”
        • “The ability to annotate these texts in private and public environments, where I could keep stuff for myself or share it with others.”
        • “The ability to conduct corpus wide inquiries is still severely limited. Search tools are good for finding needles in haystacks, terrible for extracting data for subsequent manipulations”
    14. • “Images must cooperate with both Powerpoint and Word.”
        • “Ability to export in multiple formats (i.e. jpegs and higher resolution tifs)”
        • “information on permissions and how to request them”
        • “ability to have a folder of images in the collection to which I can return that contains all of the copyright and original publication material associated with the image”
        • “viewing and zooming tools, color accuracy, ability to export, reliable metadata”
    15. • “video ability to moderate sound and size”
        • “ability to embed in other sites (I use my own websites for drafts of work in multi-media)”
        • “ability to export and listen/view in another session”
        • “Annotation tools”
        • “detailed metadata, a time counter for audio and video, downloadable”
    16. • Areas to Improve
        • Searchability
        • More content
        • Annotation and editing tools
        • Current challenges
        • Poor interfaces and user design
        • Incomplete contents
        • Too much data?
    17. • Phase 1 Workshops held in 2008-2009
        • Qualitative data from focus groups and workshops gathered and analyzed by Quinn Dombrowski, Scholarly Practices in the Humanities: Directions, Trends and Opportunities internal report
        • Gathered data from workshop discussion of scholarly research practices
        • Comparative analysis of workshop data and survey data: How do we map expressed scholarly practices to expressed scholarly needs?
    18. Searchability Gathering/Foraging Synthesizing/Filtering Tools for visualizing relationships Contextualizing Conceptualizing, Refining, Critiquing Research Documentation Tools Documenting Methods Metadata Managing Data Annotation Tools Annotating/Documenting Modeling/Visualizing Teaching/Research Interoperability Sharing/Publishing Funding Collaborating Citation/credit/peer-review
    19. • “The trick is that the digital collection has to be sufficiently large (pulling from multiple sources) for it to be useful and often it would be preferable for the collection to have the complete run of a magazine so that I can compare media from different decades.”
        • “The ability to control your collection, set up your own library and so on and go deeper and deeper, adding tags, etc. Where it’s less of a skill and more of an expectation. ‘Personalization of text that in print is crucial.’ That’s a big gap.”
        • “Better interfaces that allow more flexibility and manipulation of digital images, access to geographic systems and maps to spatialize knowledge in ways that cannot be done with print material.”
        • Participatory tagging and other forms of shared knowledge building.
    20. • “The challenges are (a) finding resources to make high-quality digital collections, (b) making those digital collections usable and useful, which requires (c) training humanists to be deep designers of technologies.”
        • “I theorize models of historiographic placement, searching, and locatability; so I also consider the various functionality of digital tools.”
        • “The easier objects are to repurpose, remix, and reuse the better.”
    21. CLIR, The Idea of Order: Transforming Research Collections for 21st Century Scholarship
        “Enabling anything like seamless access to the cultural record will require developing tools to navigate among vast catalogs of born-digital and digitized materials, as well as the records of physical materials.” (ACLS, Our Cultural Commonwealth report, 2006)
