software developed at University of Southampton to enable anyone to set up
their own Open Archives-compliant institutional archive. Originally programmed for subject repositories but now re-engineered for IR. Does not identify treatment of datasets, though it can cover bibliographic description.
DSpace: Durable Digital Depository [http://dspace.org/]. Open-source software developed at MIT for its own repository; released as open source in Nov. 2002.
Overtly identifies datasets. Offers opportunity to explore the issues surrounding the
incorporation of different metadata standards within one system…. Different disciplines have adopted different sets of metadata standards to accommodate their particular data needs.
Two examples are the CSDGM standard for geospatial data and the DICOM standard for digital imaging in medicine. … develop more general standards, such as Dublin Core, which
proposes a basic set of common elements that can be used across many different disciplines and document types.
(DC and MARC are norms)
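The idea of a common element set used across disciplines can be sketched as a crosswalk. This is a minimal illustration, not how DSpace or any other repository implements it: the fifteen Dublin Core element names are real, but the source field names on the left of CROSSWALK are invented for the example and are not actual CSDGM or DICOM element names.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical crosswalk: the left-hand field names are made up for
# illustration and are NOT real CSDGM or DICOM element names.
CROSSWALK = {
    "dataset_name": "title",
    "originator": "creator",
    "abstract": "description",
    "bounding_box": "coverage",
    "release_date": "date",
}


def to_dublin_core(record, crosswalk=CROSSWALK):
    """Map a discipline-specific record onto Dublin Core elements.

    Returns (dc_record, unmapped). Fields with no Dublin Core
    equivalent are reported rather than silently dropped.
    """
    dc_record, unmapped = {}, []
    for field_name, value in record.items():
        element = crosswalk.get(field_name)
        if element in DC_ELEMENTS:
            # Dublin Core elements are repeatable, so collect values in a list.
            dc_record.setdefault(element, []).append(value)
        else:
            unmapped.append(field_name)
    return dc_record, unmapped


# Illustrative record; the values are placeholders, not real holdings.
geo_record = {
    "dataset_name": "Sea surface temperature grid",
    "originator": "Example Data Centre",
    "grid_cell_size": "0.25 degree",
}
dc_record, unmapped = to_dublin_core(geo_record)
```

The unmapped list shows the cost of a general standard: discipline-specific detail such as grid_cell_size has no home in the common element set and has to be carried some other way.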
https://dspace.ucalgary.ca/handle/1880/33 (registration required to search)
http://careo.ucalgary.ca/cgi-bin/WebObjects/CAREO.woa - information products
Subject - arXiv, Cogprints, RePEc
Institutional – Southampton, Glasgow, Nottingham (SHERPA), MBA UK
National - DARE (all universities in the Netherlands), Scotland, British Library (proposal)
National / Subject - ODINPubAfrica
International - Internet Archive ‘Universal’, OAIster
Conference - 11th Joint Symposium on Neural Computation, May 15 2004
Personal – peer to peer, web pages etc
Media Type - VCILT Learning Objects Repository, NTDL (Theses)
Publisher – journal archives
Data Repositories/Archives - NODC, BODC, DOD, JODC, BADC etc
Science, particularly Environmental Science, is well served
Logical host for numeric datasets
Data Centres / Archives / Repositories
Within organisational infrastructures but not defined by them
Subject and Technical Specialists, quality control of content
Secure storage and migration policies
Well developed Metadata schema & Standards
DIF – Directory Interchange Format, FGDC etc
the minimum set of metadata required to serve the full range of metadata applications (data discovery, determining data fitness for use, data access, data transfer, and use of digital data);
optional metadata elements - to allow for a more extensive standard description of geographic data, if required;
a method for extending metadata to fit specialized needs.
Though ISO 19115:2003 is applicable to digital data, its principles can be extended to many other forms of geographic data such as maps, charts, and textual documents as well as non-geographic data.
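The three-part structure described above (a required minimum, optional elements, and a way to extend) can be sketched as a small validation profile. This is only an illustration under stated assumptions: the class, its method names and every element name below are hypothetical placeholders and are not taken from ISO 19115 itself.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataProfile:
    """A required core, optional elements, and community extensions."""
    required: set
    optional: set = field(default_factory=set)
    extensions: set = field(default_factory=set)

    def extend(self, *names):
        """Return a new profile with extra community-specific elements."""
        return MetadataProfile(
            set(self.required),
            set(self.optional),
            self.extensions | set(names),
        )

    def check(self, record):
        """Return (missing required elements, elements the profile does not know)."""
        present = set(record)
        missing = sorted(self.required - present)
        known = self.required | self.optional | self.extensions
        unknown = sorted(present - known)
        return missing, unknown


# Element names are illustrative placeholders, not ISO 19115 element names.
base = MetadataProfile(
    required={"title", "abstract", "reference_date", "contact"},
    optional={"lineage", "spatial_resolution"},
)
marine = base.extend("cruise_id")

record = {
    "title": "CTD profiles",
    "abstract": "Temperature and salinity casts",
    "contact": "data manager",
    "cruise_id": "EX-01",
}
base_result = base.check(record)
marine_result = marine.check(record)
```

Checking the same record against both profiles shows what the extension mechanism buys: base reports cruise_id as unknown, while marine accepts it, and both still flag the missing required reference_date.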
“a trusted repository” supported by the Data Management Community
ARCHIMEDE: A Canadian software solution for institutional repositories [http://archimede.bibl.ulaval.ca/di/Welcome.do]. OAI-compliant software developed by Laval University Library. Archimede has been developed from a multilingual perspective, with internationalization as a focus. The interface text is kept separate from the code rather than embedded in it, making it relatively easy to develop an interface in a specific language without working on the code itself. English, French and Spanish interfaces are already offered. This also lets users switch easily between languages at any point during search and retrieval.
Berkeley Electronic Press [http://www.bepress.com/repositories.html]. Commercial OAI-compliant software used by the University of California’s eScholarship Repository.
CERN Document Server Software (CDSware) [http://cdsware.cern.ch/]. OAI-compliant software developed by, maintained by, and used at, the CERN Document Server.
Project Tapir [http://sourceforge.net/projects/tapir-eul]: Tapir provides additional functionality to digital asset management software DSpace primarily designed for Electronic Theses and Dissertations supervision, submission and dissemination. See Queen's University Project.
Fedora™ Project: An Open-Source Digital Repository Management System [http://www.fedora.info/]. Jointly developed by the University of Virginia and Cornell University, Fedora is a general-purpose digital object repository system that can be used in whole or part to support a variety of use cases including: institutional repositories, digital libraries, content management, digital asset management, scholarly publishing, and digital preservation.
Greenstone [http://www.greenstone.org/cgi-bin/library?a=p&p=home]. Suite of open-source multilingual software for building and distributing digital library collections. Produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed (since 2000) in cooperation with UNESCO and the Human Info NGO. Presently in limited use at New Zealand Digital Library Project and some other sites.
OCLC Research Software [http://www.oclc.org/research/software/default.htm]. A list of open source software developed by the Online Computer Library Center (OCLC) to build a repository and harvest data according to OAI-PMH standards.
FIGARO, i-TOR, etc
Dilemma for Researcher
Mandates from major funding agencies now require grantees to deposit research output in a ‘designated repository’ or in ‘any’ repository
Wellcome Trust (UK PubMed) - £400 million producing 3500 papers per year
Where should the full text of their research be deposited?
Researcher wants to enter metadata and deposit only once and perhaps deposit all related material in one place?
Situation at present
Harvesting, but harvester is not the choice of the depositor
Duplicate keying metadata into repositories of choice
Cannot target multiple repositories with one exercise
Does it matter where it is deposited since Google Scholar, Yahoo, Scopus will pick it up wherever it is?
Repositories taking over the world?
Not between Institutional and Subject Repositories – complementary and should coexist
Possibly between Text based and Numeric based repositories
Repositories of whatever flavour v. Data Centres
Are both spilling over into each other's territory?
CLADDIER Project ** (Citation, Location And Deposition in Discipline and Institutional Repositories)
The CLADDIER system will be a step on the road to a situation where (in this case, environmental) scientists will be able to move seamlessly from information discovery (location), through acquisition to deposition of new material, with all the digital objects correctly identified and cited. The lessons learned will be applicable to the relationships between other discipline-based repositories and institutional repositories.
**JISC Digital Repositories Programme 2005 -
Automated Linking both ways
Where to Deposit
One outcome of CLADDIER Project
‘pull’ = Harvesting
‘push’ = CLADDIER outcome
Enable researcher to deposit in one repository and choose to upload (push) the metadata to another repository of choice.
Logical to ‘push’ from IR to Subject?
Redundancy of records?
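The pull and push models above can be contrasted in a short sketch. The pull side builds a standard OAI-PMH ListRecords request (the verb and metadataPrefix arguments are part of the protocol; the base URL is a made-up example). The push side is purely hypothetical: the Repository class and its push method are an in-memory stand-in for illustration and do not describe how CLADDIER implements deposit.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Pull: build the OAI-PMH request a harvester would send."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


class Repository:
    """Hypothetical in-memory repository, keyed by record identifier."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def deposit(self, identifier, metadata):
        """Store (or update) the metadata held under this identifier."""
        self.records[identifier] = dict(metadata)

    def push(self, identifier, target):
        """Push: copy one record's metadata to a repository the depositor chose.

        Returns True if the target did not already hold the identifier.
        Keying on the identifier means a repeated push updates the
        existing entry instead of creating a redundant second record.
        """
        is_new = identifier not in target.records
        target.deposit(identifier, self.records[identifier])
        return is_new


# Pull: the harvester decides what to collect (example.org is a placeholder).
url = list_records_url("http://repo.example.org/oai")

# Push: the depositor keys metadata once, then chooses the second target.
institutional = Repository("institutional")
subject = Repository("subject")
institutional.deposit("oai:example:1", {"title": "Cruise report"})
first_push = institutional.push("oai:example:1", subject)
second_push = institutional.push("oai:example:1", subject)
```

In the pull case the choice of what lands where sits with the harvester; in the push case it sits with the depositor, which is the distinction the slide draws.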
Pauline Simpson ( [email_address] )
Discovery metadata - What data sets hold the sort of data I am interested in? This enables organisations to know and publicise what data holdings they have.
Exploration metadata - Do the identified data sets contain sufficient information to enable a sensible analysis to be made for my purposes? This is documentation to be provided with the data to ensure that others use the data correctly and wisely.
Exploitation metadata - What is the process of obtaining and using the data that are required? This helps end users and provider organisations to effectively store, reuse, maintain and archive their data holdings.
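One way to picture the three levels is as separate sections of a single dataset description, each answering one of the questions above. The sketch below is an assumption made for illustration: the class, its method and all field names are hypothetical and do not come from any metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical record split into the three levels described above."""
    discovery: dict = field(default_factory=dict)     # what data exist and who holds them
    exploration: dict = field(default_factory=dict)   # documentation for judging fitness for use
    exploitation: dict = field(default_factory=dict)  # how to obtain and use the data

    def supported_levels(self):
        """List the levels for which this record carries any information."""
        levels = []
        for name in ("discovery", "exploration", "exploitation"):
            if getattr(self, name):
                levels.append(name)
        return levels


# Field names and values are placeholders for illustration only.
record = DatasetMetadata(
    discovery={"title": "Tide gauge series", "holder": "Example Data Centre"},
    exploitation={"access": "request by email", "format": "CSV"},
)
levels = record.supported_levels()
```

A record like this one can be found and obtained, but the empty exploration section means a user has no documentation for judging whether the data suit their analysis.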