Six Use Cases for Edinburgh DataShare


Presentation at Repository Fringe, Edinburgh: 1st August, 2013

Published in: Education, Technology

  • Edinburgh DataShare is a free-at-point-of-use data repository service which allows University researchers to upload, share, and license their data resources for online discovery and re-use by others. It was set up in 2008 as an exemplar of institutional data repositories, using DSpace, with partners at Oxford and Southampton working with Fedora and EPrints.
  • The University RDM Policy has implications for the provision of the data repository service.
  • Edinburgh DataShare is a key part of the Data Stewardship component of the University RDM Roadmap. Legacy datasets can pose challenges for deposit and are not considered important for policy.
  • This pilot has challenged us on a number of usability issues for deposit: easing the burden of making decisions, making our instructions and hints as clear as possible, and making it easy to skip fields that are not relevant. We provided a user guide with screenshots and a checklist for deposit.
  • This pilot user had an audio archive that was well-curated and ready to be made open. Collection already had a ‘home’ in a trusted disciplinary repository, though ours was made public first. User was happy to give the collection greater visibility, as long as he didn’t have to upload files one by one.
  • User is already delivering files to specialist peers via website. Legacy datasets have existing licences embedded in headers; customised by University lawyers ten years ago. We are grappling with the desire for user registration in an open repository.
  • Some research considered ‘sensitive’ because of use of animals: not wanting to attract unwanted attention. Many large datasets saved in various places without archiving. Can/should the repository offer their storage solution for large and exponentially growing datasets, so long as they make it open, or should some appraisal step be introduced? The institute is wondering if they should be serving their own data for a price.
  • The research centre in Taiwan which serves the data during the life of the project may not feel obliged to make the data available long-term. The PI has offered to deposit a 5% sample of the data only. Could this be a good example for an external website maintained by others providing the search mechanism to retrieve objects within the repository? Do we need to alter some aspects of repository behaviour to accommodate this collection and the balance for searching across the repository, pagination of item listings, etc.?
  • This community is considering Edinburgh DataShare as one of several options for solving a range of problems to do with its research data.
  • Six Use Cases for Edinburgh DataShare

    1. Robin Rice, EDINA and Data Library, University of Edinburgh. Repository Fringe 2013: 1-2 August, Edinburgh
    2. The data repository and University RDM policy: “9. Research data of future historical interest, and all research data that represent records of the University, including data that substantiate research findings, will be offered and assessed for deposit and retention in an appropriate national or international data service or domain repository, or a University repository.”
    3. Edinburgh DataShare is seen by the RDM Steering Group as one of the key RDM services offered by Information Services, and as such has challenged us to meet the requirements of a number of pilot submissions from a range of different types of research communities with particular types of data.
    4. Single item deposit, the dataset behind an article. Desire to get students to deposit their data from theses as norm: need unambiguous deposit workflow. Fieldwork in NHS means much data is ‘sensitive’. Permanent embargoes? Dr. Nuno Feirrera, Teaching Fellow
    5. Dr. Bert Remijsen, Chancellor’s Fellow. Village of Fafanlap, Indonesia, on Bert’s home page. Dinka Songs of South Sudan collection, 62 items. Used DSpace collection template for metadata; files uploaded by assisted deposit. Also deposited in Max Planck specialist language repository. Annotations in specialist format, requiring software from Max Planck to read. User happy with download statistics, referring colleagues.
    6. The Listening Talker collection identified for deposit, ongoing. Very large video files plus software as VM image. Tar gzipped files containing millions of files. Several GB in size. Desires user registration, non-standard licenses and checksums with downloads. Prof. Simon King
    7. Lots of ’omics data: not as many subject repositories to hold these as thought – storage cost concerns. Interested in push-pull of metadata to websites, from CRIS. Spearheaded by Data Manager. Dolly the Sheep
    8. Fish4Knowledge EU-funded research project. Long-term sustainability issues for observational data. Search engine maintained on their website, using METS feed to locate items. Testing SWORD implementation, 5% sample >10K files, video + SQL rows (3 TB). Efficiency & performance. Prof. Bob Fisher
    9. New member of University. Digital asset management needs. Nature of research data in the arts. Streaming & display requirements (high quality desired).
    10. Usability & user education. Encouraging user to document and future-proof. Relationship of IRs and subject repositories. Closed collections, length of embargoes, user registration in an open access service. Enhancing repo. functionality while developing new systems (storage, data asset registry). Repository as golden copy/format. Preservation procedures and SIPs, AIPs, and DIPs.
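
The Listening Talker use case (slide 6) asks for checksums to accompany downloads. As a rough illustration of what that could involve for someone depositing or downloading a multi-gigabyte archive, the sketch below computes a digest in fixed-size chunks, so the file never has to fit in memory, and compares it with a published value. The function names and the choice of SHA-256 are assumptions made for this example; the slides do not say which algorithm or tooling DataShare would use.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Hypothetical helper for illustration; SHA-256 is an assumed choice.
    """
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # read() returns b"" at end of file, which stops the iterator.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare a downloaded file against a checksum published with it.

    The published value is normalised (whitespace stripped, lower-cased)
    because checksum listings vary in case and trailing newlines.
    """
    return sha256_of_file(path) == published_hex.strip().lower()
```

A downloader would call `matches_published_checksum("archive.tar.gz", value_from_item_page)`, where both arguments are placeholders, and treat a `False` result as a corrupted or incomplete transfer.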