The single most pressing challenge facing institutional repositories today concerns content recruitment. The current system of scholarly communication is well established. To be successful, institutional repositories need to occupy a more central role in scholarly communication, and to do that, they must be populated with material that scholars find relevant.
Many institutions have found that while faculty are often enthusiastic about the prospect of establishing an institutional repository, actual participation in the form of contributing content to the repository is far more haphazard. While a variety of explanations have been offered for this, one of the most basic reasons is that researchers are focused on their work and communicating with colleagues, not on building a repository for their institution. In light of this realization, institutional repositories have begun exploring ways to boost participation.
Educating faculty and other researchers about the goals of the repository and the benefits of making their work more freely available is, of course, crucial. But, in general, such efforts have not been sufficient, and we can expect to see increasing use of other outreach methods in the future.
One option is to work toward establishing policies that mandate the deposit of certain types of materials in the institutional repository, along the lines of the recent mandate adopted by the Faculty of Arts and Sciences at Harvard.
A second approach is to introduce a mediated deposit service in which librarians supply a range of services including digitization of paper items and individual copyright counseling. In such a system, librarians will also submit items on behalf of contributors in an attempt to lower the barrier to participation. The library where Wendy and I work is in the process of developing a repository now, and is attempting to address this by building procedures that make contribution as easy as possible for researchers, taking much of the “busy work” out of their hands.
Another likely area for future development is the creation of services that allow contributors and others to build a network for sharing learning materials and collaborating with colleagues at their institution. Such an approach would also help create a more meaningful context for some of the materials currently housed in institutional repositories that do not appear to hold immediate scholarly value.
Overall, it’s an exciting time to be working with institutional repositories, although there are serious challenges to be met. As more institutions establish them and improve techniques for both filling their repositories with quality scholarly content and helping researchers use the material in them, we can expect institutional repositories to play an increasingly important role in scholarly communication and in the intellectual life of our research institutions.
This last step, that of researchers actually using the information housed in institutional repositories, is critical to their success. In order to occupy a meaningful place in the process of scholarly communication, institutional repositories need to fit in with the overall scholarly landscape and not just be isolated outposts housing documents that are difficult to discover and unused by researchers.
The use of OAI-PMH compliant metadata makes it possible for the contents of institutional repositories to be discovered by researchers using any number of search interfaces.
Probably the best known of these is OAIster, a service that allows users to search across more than a thousand contributing repositories. Similar capability is available through searching directories of institutional repositories like OpenDOAR. Repository content can also be found through specialized search engines like Scirus, or by using Google.
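To make the harvesting idea concrete, here is a minimal Python sketch of what a cross-repository search service does: it builds a ListRecords request and reads the Dublin Core fields out of the response. The repository address and the sample record are invented for illustration, and the response is an inline string so the example runs without a network connection. The verb, the oai_dc metadata prefix, and the XML namespaces are the ones defined by OAI-PMH 2.0.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester requests to list a repository's records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

def parse_records(xml_text):
    """Return one dict per record with its identifier, titles, and creators."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
        })
    return records

# Invented response, trimmed to the elements this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.edu:1234</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Post-print</dc:title>
          <dc:creator>Researcher, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://repository.example.edu/oai"))
for rec in parse_records(SAMPLE_RESPONSE):
    print(rec["identifier"], rec["titles"], rec["creators"])
```

A real harvester would fetch the URL, follow resumption tokens for long result sets, and repeat this for every repository it covers; those steps are left out here.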
There are a few challenges related to searching for material in institutional repositories, however. First, while the use of unqualified Dublin Core makes interoperability possible, it also limits the degree of precision one can use in a search. Another issue is the fact that institutional repositories contain all manner of documents, some of great scholarly value and some with minimal scholarly relevance. These issues likely underlie the somewhat disappointing levels of use by scholars, and are among the challenges that institutional repositories must struggle with in order to become truly successful.
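The precision problem can be illustrated with a small Python sketch. In unqualified Dublin Core a record may carry several date values with no indication of what each one means, so a date search can only match any of them. Qualified terms such as created and issued keep the distinction. The record below is invented for illustration.

```python
# Invented record in unqualified Dublin Core: two dates, no way to tell them apart.
unqualified = {
    "title": ["Sample Working Paper"],
    "date": ["2006-03-01", "2008-09-15"],
}

# The same record with qualified date terms (created vs. issued).
qualified = {
    "title": ["Sample Working Paper"],
    "created": ["2006-03-01"],
    "issued": ["2008-09-15"],
}

def matches_year(record, field, year):
    """True if any value of the given field starts with the given year."""
    return any(value.startswith(str(year)) for value in record.get(field, []))

# Question: "Was this item issued in 2006?"
# With unqualified metadata the closest available question is
# "does any date fall in 2006?", which gives a false hit for this record.
print(matches_year(unqualified, "date", 2006))   # True (a false positive)
print(matches_year(qualified, "issued", 2006))   # False (the correct answer)
```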
So, your new repository software has been selected and set up on your server … but how exactly does it work? A couple of weeks ago, Candy discussed the Open Archival Information System model. This same model structures the workflow of an institutional repository.
Let’s start at the beginning. Suppose a well-informed faculty member with a strong commitment to open access (thanks, no doubt, to her library’s persuasive information sessions) has just finished a paper. When submitting it for publication in the journal of her choice, she has taken care to amend the journal’s copyright transfer agreement to allow her to deposit a post-print copy of her article in her college’s repository.
After publication in the journal, she takes the post-print PDF of her paper (known in the terms of the OAIS model as a Submission Information Package), and deposits it in her college’s customized DSpace system by entering the required metadata and uploading the file.
Once the file is ingested into the system, a repository librarian reviews, verifies, and edits the item, and clears copyright and licensing issues or turns these questions back to the submitting faculty member. Metadata is added and the object is stored in the system in accordance with digital preservation standards. The Archival Information Package, as it is now called, can be searched by community, date, title, author, and subject.
A few weeks later, when a colleague searching for our researcher’s paper queries the repository, either using the local interface or a harvester such as OAIster, the item is called up as a Dissemination Information Package and served up to them on their desktop. Surrounding and supporting this workflow are policies, procedures, services, and the technical infrastructure required to maintain this system.
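The three package types in this workflow can be sketched as plain data structures. The Python below is only an illustration of the submission, archival, and dissemination stages described above. The class names, the required fields, the identifier, and the rule that an item is served only after rights are cleared are all assumptions made for the example, not part of DSpace or of the OAIS model itself.

```python
from dataclasses import dataclass

@dataclass
class SubmissionPackage:        # SIP: what the contributor deposits
    filename: str
    metadata: dict

@dataclass
class ArchivalPackage:          # AIP: what the repository stores and preserves
    identifier: str
    filename: str
    metadata: dict
    rights_cleared: bool = False

@dataclass
class DisseminationPackage:     # DIP: what a searcher receives
    identifier: str
    filename: str
    metadata: dict

REQUIRED_FIELDS = ("title", "author")   # assumed local policy, for illustration

def ingest(sip, identifier, added_metadata, rights_cleared):
    """Librarian review step: check required metadata, enrich it, build the AIP."""
    missing = [name for name in REQUIRED_FIELDS if not sip.metadata.get(name)]
    if missing:
        raise ValueError(f"missing required metadata: {missing}")
    metadata = {**sip.metadata, **added_metadata}
    return ArchivalPackage(identifier, sip.filename, metadata, rights_cleared)

def disseminate(aip):
    """Serve a stored item to a user (assumed rule: only once rights are cleared)."""
    if not aip.rights_cleared:
        raise PermissionError("copyright questions are still open for this item")
    return DisseminationPackage(aip.identifier, aip.filename, dict(aip.metadata))

# Hypothetical deposit, review, and retrieval.
sip = SubmissionPackage(
    "postprint.pdf",
    {"title": "A Sample Post-print", "author": "Researcher, A."},
)
aip = ingest(sip, "hdl:0000/1", {"subject": "Open access"}, rights_cleared=True)
dip = disseminate(aip)
print(dip.metadata["subject"])   # Open access
```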
DSpace is by far the most commonly used institutional repository system. It can run on most web servers, and it is probably the system that is best placed to take advantage of the benefits of open source software, since it has the largest and most open development community. It is well documented and has a very active users group. These circumstances make it much easier for new users to find assistance when starting up.
The concept of a community of users supporting the software was built into the system. DSpace actually integrates a user community into the system architecture, an approach none of the other platforms have taken. This allows for the various departments of a research institution to participate in a way that is customizable to the needs of each separate unit.
DSpace also has the benefit of being created by MIT and Hewlett-Packard, with some ongoing financial support from the Andrew W. Mellon Foundation. This has drawn the involvement of eight core universities that are working to evaluate DSpace in the context of various research institutions. All of this collaborative work means that a rich feature-set has developed that is compatible with the needs of such institutions, most notably a focus on long-term preservation of research materials.
Finally, ongoing support is an institutionalized priority. Last year a non-profit foundation was created by MIT and Hewlett-Packard to provide support for the growing DSpace community.
One of the most important choices faced in establishing a new institutional repository is what software platform to adopt. At present, there are four leading repository systems. While they all address the same basic need, they differ from one another in terms of their technical details, and each one offers varying degrees of customization and developer or user-community support. When selecting an institutional repository system, many of the issues raised during last week’s open source software presentation hold true. One must examine not only the features offered, but also the availability of implementation help and ongoing support.
Of the four leading software platforms, Digital Commons is the only one that is not open source. Developed by the Berkeley Electronic Press, this system is offered as a fully hosted application with unlimited customer support. Berkeley Electronic Press does not just offer technical support; they will also assist institutions in engaging faculty and harvesting content. They claim to be able to get a repository up and running within 1 to 2 weeks, with only an hour-long staff training needed before an institution can begin adding content. For an institution with a small IT department, this is a very appealing option. The trade-off, however, is that Digital Commons does not offer much in the way of customization, as the code is closed and sits on Berkeley Electronic Press’ servers.
EPrints and FEDORA, on the other hand, are both open source. EPrints runs on a standard web server and requires no additional software aside from the EPrints base install. EPrints is thought of by many as the least complicated of the open source options and is well documented. However, it does not benefit from a wide developer base. Although the code is touted as open source, the University of Southampton works in a closed community and does not accept modifications to the base code. This could be a problem for an institution that needs a high level of customization when developing its repository.
The main difference between FEDORA and EPrints is that FEDORA is a web service, which means it needs a user interface built on top of it to have a manageable front-end. While FEDORA also runs on a standard web server, another application needs to be selected and installed for it to function fully. FEDORA is considered the most robust and highly customizable of the open source repository systems, and it might be a good option for an institution with a strong IT department and the technical staff to support it.
So what does it take to start up an institutional repository? As with any successful project, the relevance of the repository needs to be clearly explained to the institution and contributors. This is particularly important because the researchers of the institution need to be active partners in adding material to the repository. Librarians will need to address concerns around copyright, journal contracts, and redundancy. Most importantly, they will have to demonstrate the value of open access to the institution’s scholarly output. Demonstrating the relevance of the repository and clearing up confusion about open access and scholarly publishing practices are also ongoing tasks that don’t end with the establishment of a functioning repository system.
Librarians will need to define policies, establish procedures, and develop services for maintenance of the repository. The most important policies may be the ones that affect the contributors of content, as these policies will have to be clearly communicated across many organizations within the institution. These include submission policies regarding acceptable content, file formats, version control, and required author metadata. Repository librarians will also need to design workflows that include review and editing processes for incoming objects. Ongoing work will be required to educate researchers in informational sessions on copyright and open access issues. Institutional policies will need to be established regarding whether the repository will be populated by voluntary self-archiving or through a mediated deposit process, in which librarians actually add objects to the repository on behalf of contributors.
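A submission policy of this kind can be written down as a small configuration and checked automatically at deposit time. The Python sketch below assumes a hypothetical policy: the accepted file formats, version labels, and required metadata fields are invented examples, not recommendations taken from any repository platform.

```python
from pathlib import Path

# Hypothetical policy values, chosen only to illustrate the idea.
POLICY = {
    "accepted_formats": {".pdf", ".txt", ".csv", ".tiff"},
    "accepted_versions": {"pre-print", "post-print", "published"},
    "required_metadata": ("title", "author", "date"),
}

def check_submission(filename, version, metadata, policy=POLICY):
    """Return a list of human-readable policy problems (empty if there are none)."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in policy["accepted_formats"]:
        problems.append(f"file format {suffix or '(none)'} is not accepted")
    if version not in policy["accepted_versions"]:
        problems.append(f"version '{version}' is not recognized")
    for name in policy["required_metadata"]:
        if not metadata.get(name):
            problems.append(f"required metadata field '{name}' is missing")
    return problems

# A deposit in an unaccepted format and without a date fails two checks.
for problem in check_submission(
    "paper.docx", "post-print", {"title": "A Sample Paper", "author": "Researcher, A."}
):
    print(problem)
```

Returning a list of problems rather than stopping at the first one lets a librarian or contributor fix everything in a single pass, which suits either self-archiving or mediated deposit.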
Once the framework is in place, the repository infrastructure needs to be built. Repositories use OAI-PMH compliant systems so that their metadata can be harvested and cross-searched by services like OAIster. Specialized staff is needed to review, edit and structure the objects, customize systems, and add appropriate metadata. It is also crucially important that the project includes staff skilled in marketing, outreach, and liaison services to help recruit quality content.
The ideas that led to the development of institutional repositories grew out of the open access movement with the goal of enhancing free scholarly communication. The earliest precursors of institutional repositories were disciplinary repositories, such as Working Papers in Economics (now known as EconPapers) and arXiv for the physics, mathematics, & computer science communities.
In time, the development of software platforms like EPrints, FEDORA, and DSpace helped lower the technological barriers for organizations wanting to establish repositories. Meanwhile, rising journal costs provided greater incentives for universities and libraries to work toward developing alternatives to the journal system. The confluence of these and other factors opened the road to the development of early institutional repositories, which began to emerge in the early years of this decade, some notable examples including E-Prints at Australian National University in 2001, and DSpace@MIT, the University of Edinburgh Research Archive, and eScholarship at the University of California, all launched in 2002.
Recently, interest in institutional repositories has been growing in light of developments such as the NIH mandate to make the products of taxpayer-funded research freely available to the public, and universities like Harvard urging their faculty members to deposit papers and other publications in institutional repositories.
Institutional repositories share many features with digital libraries, but they can be distinguished from the sort of digital library we’ll be building in this class by a couple of important characteristics.
As we’ve already mentioned, institutional repositories are designed primarily to collect, preserve, and make available the scholarly output of a given academic institution. In contrast, ordinary digital libraries may be organized around any number of principles, some of the most common being topic, subject, discipline, or even a particular work or document.
Institutional repositories and digital libraries also differ in regard to how they acquire content. While the collections contained in digital libraries are generally the result of deliberate collection development efforts on the part of those operating the library, institutional repositories are typically dependent upon the voluntary contributions of researchers. While some institutions have attempted to require the deposit of certain types of materials, most depend upon voluntary participation. This fact is behind one of the greatest challenges facing institutional repositories today, that is, the relatively low rate of contribution by researchers. We’ll discuss this a bit more later on.
Another difference between institutional repositories and digital libraries is that institutional repositories are fundamentally a place to store materials. Consequently, there may be minimal services offered to users. In contrast, digital libraries will often offer services to users, ranging from reference and research support to bibliographies of related materials to the sort of interpretive content and teaching resources we’ll be including in our digital library.
First, as the Crow definition stressed, they are organized around an academic or research institution, and are intended to house materials that represent the intellectual output of that organization. Second, they are necessarily a collection of documents and objects, typically of varying types and formats. Researchers affiliated with the sponsoring organization might deposit texts, data sets, sound files, images, or any number of other items.
Significantly, these documents may be from any stage in the process of scholarly inquiry and may therefore carry varying levels of scholarly authority. For instance, while peer-reviewed published papers may be included, searchers will typically also find pre-prints, conference presentations, theses and dissertations, working papers, course materials, organizational records, or anything else a contributor may have chosen to deposit.
Finally, institutional repositories are closely tied to the ideals and goals of the open access movement and the belief that scholarly communication should be as open and free as possible.
Institutional Repositories
Wendy Brown
Jen Langley
Joshua Parker
October 2, 2008
LIS 462: Digital Libraries (Schwartz)
Graduate School of Library & Information Science
Simmons College, Boston, MA
What is an Institutional Repository?
Institutional repositories [are] ... digital collections
capturing and preserving the intellectual output of a
single or multi-university community. (Crow,
2002).
A university-based institutional repository is a set of
services that a university offers to the members of its
community for the management and dissemination of
digital materials created by the institution and its
community members. It is most essentially an
organizational commitment to the stewardship of these
digital materials, including long-term preservation
where appropriate, as well as organization and access
or distribution. (Lynch, 2003)
Institutional Repositories are:
• Centered around a university (or other academic
institution) and contain items which are the scholarly
output of that institution
• A collection of (digital) objects, in a variety of
formats
• Include works of various degrees of scholarly
authority and from various stages in the process of
scholarly inquiry. In addition to published works, an IR
may include preprints, theses & dissertations, images,
data sets, working papers, course materials, or anything
else a contributor deposits
• Typically motivated by a commitment to open access
IRs & Digital Libraries
Institutional Repositories
• Are organized around a particular institutional community
• Often are dependent upon the voluntary contribution of materials by scholars for the content in their collection
• Are mainly repositories and therefore may only offer limited user services
Digital Libraries
• May be built around any number of organizing principles (often topic, subject, or discipline)
• Are the product of a deliberate collection development policy
• Typically include an important service aspect (reference and research assistance, interpretive content, or special resources)
Origins & Development
• Open access (OA) movement and free scholarly communication
• Disciplinary Repositories
• Software development
• Institutional Repositories
• Legal and Institutional Deposit Policies
Starting & Maintaining an IR
Steps to Building an IR
1. Justify the relevance to the institution and contributors
2. Develop a policy framework. How will we find this content and what will we do with it?
3. Build the infrastructure
Bonus: Get institutional support and a mandate.
Key Issues:
• Faculty buy-in
• Submission policies
• Intellectual Property issues
• Mediated deposit
• Metadata
• OAI-PMH compliant systems
• Specialized staff
• Outreach and Liaison services
Four Widely Used Systems
• Digital Commons: Produced by Berkeley Electronic Press (bepress), focused on maintaining scholarly output. Not open source.
• EPrints: Developed at the University of Southampton (UK). Widely considered to be the least complex of the major repository software platforms.
• FEDORA: Developed at Cornell and University of Virginia. Based on a framework known as the Flexible Extensible Digital Object Repository Architecture.
• DSpace: Designed by MIT and Hewlett-Packard to manage the intellectual output of research institutions and provide for long-term preservation.
DSpace
How Does an IR Work?
Submission and Ingestion
• contributor metadata
• formatting
• copyright
Post-Submission
• quality metadata (DC)
• Intellectual Property issues
User Query
Ongoing workflows
• Preservation
• Administration
• Data Management
• System customization
Searching Across Multiple IRs
The use of OAI-PMH compliant metadata permits “one stop shopping”
Future & Challenges
“We can open an empty library building, and we can
market its existence all over creation, but the mere act of
doing so won't fill the shelves!” (Dorothea Salo)
Librarians care about open access, while researchers care
primarily about their field. How do we ensure that
investigators contribute to and use materials in
institutional repositories?
The Next Steps:
• Content Recruitment and Advocacy
• Mandates
• Mediated Deposit
• Networked communities of teaching and learning
Questions?
Annotated resource guide available at:
http://web.simmons.edu/~parker1/coursework/LIS462_IR_resource_guide.pdf