What does it mean?
● Building a solution to store research data
● Data can be looked after
● Found again and shared
● Searched for
● Can be reused
● A research data management solution, aka a data repository

But don't we have a repository already?
We do. The digital collections site allows researchers to deposit copies of journal papers. The current project aims to do the same for research data.
Each dataset will have a unique and persistent identifier as an aid to citation.

Yes, but why?
Increasingly, researchers overseas are being required to make their data available in order to support their mainstream publications.
In the US, the NSF now requires researchers to submit a data management plan as part of their grant submission.
In the UK, JISC is funding a range of initiatives to build a set of data archives, both within institutions and across disciplines.

The Galathea story ...
A research institute in Denmark mounted a set of three oceanographic surveys, in 1845, 1950 and 2007.
We still have all the 1845 and 1950 data, as it was written in paper lab books.
Most of the 2007 data has been lost, as it was stored on people's laptops, not in a central archive ...

The ANU Data Commons
The aim is to put in place the building blocks of a data management strategy for the ANU
● A repository
  ○ based on fedora-commons
  ○ uses standard technologies
● A supporting policy framework

The ANU Data Commons
● Built on the back of two ANDS-funded projects
  ○ Seeding the Commons
    ■ Identify existing datasets, including orphan and legacy datasets, and publish descriptions of them in Research Data Australia
    ■ Descriptions are effectively electronic catalogue cards
  ○ Data Capture
    ■ Build workflows and mechanisms in Earth Sciences, Optical Astronomy, Phenomics, and Digital Humanities to capture research data as it is generated and publish it

Seeding the Commons
Aim to be as self-service as possible - built around the concept of self-deposit:
● user identifies themselves (logs in)
● creates a project description as free text, plus other information
● uploads data (aim is to be as simple as YouTube or Flickr)
● record is published to Research Data Australia
● data can be searched for and found again
Dataset owners can modify objects they have created, for example to add a second results file from a re-run of a particular experiment.

Data Capture Deposit model
Aim to be as self-service as possible - set up automated capture mechanisms for instruments and at the same time leverage the Seeding the Commons architecture
● User creates a project
● Enters information about the project, including links to related documents
● Uploads data
● Publishes data
Dataset owners can modify objects they have created, for example to add a second results file from a re-run of a particular experiment.

Data Capture
Basically more of the same
● data goes to short-term store
● is processed
● is uploaded
and it can be automated via a quasi drop box solution!

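A minimal sketch of what a quasi drop box sweep might look like, assuming the short-term store and the repository ingest area are both plain folders. The function names, the folder layout and the use of a SHA-256 checksum as the "processing" step are all hypothetical and chosen only for illustration; they are not the project's actual mechanism.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sweep_drop_box(drop_box: Path, repository: Path) -> dict:
    """Move every file in the drop box into the repository folder.

    Returns a mapping of file name -> checksum for the files moved.
    Both folders are hypothetical stand-ins for the short-term store
    and the repository's ingest area.
    """
    repository.mkdir(parents=True, exist_ok=True)
    moved = {}
    for item in sorted(drop_box.iterdir()):
        if not item.is_file():
            continue  # this simple sketch ignores sub-folders
        checksum = sha256_of(item)  # stand-in for "is processed"
        shutil.move(str(item), str(repository / item.name))  # "is uploaded"
        moved[item.name] = checksum
    return moved
```

Run on a schedule, a sweep like this would let an instrument simply write files into the drop box folder; recording a checksum at capture time also gives a way to confirm later that a file has not changed.
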
Data Capture - not just data
Data needs context.
Context is metadata, and includes things like the instrument configuration, settings and so on.
Some formats, e.g. WAV for audio and FITS for astronomy, contain a lot of this information in the file by default - embedded metadata

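As an illustration of embedded metadata, the sketch below reads the recording parameters stored in a WAV header using Python's standard `wave` module. The function name `read_wav_header` and the dictionary keys are made up for this example. A WAV header only carries basic parameters such as channel count and sample rate; richer context, such as FITS header cards, would need a format-specific library.

```python
import io
import wave


def read_wav_header(source) -> dict:
    """Return the metadata embedded in a WAV file's header.

    `source` may be a file path string or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "frames": frames,
            "duration_s": frames / rate,
        }


# Build a one-second silent mono recording in memory to demonstrate.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)  # 16-bit samples
    out.setframerate(8000)
    out.writeframes(b"\x00\x00" * 8000)
buffer.seek(0)

print(read_wav_header(buffer))
```

A capture workflow could harvest fields like these automatically at deposit time, so the researcher only has to type in the context the file does not already carry.
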
What does it mean for me?
● Able to deposit existing research data sets
  ○ know that they are stored securely
  ○ can be read again - immune to format and media changes
  ○ most important - can be reused
Access can be
● Open - anyone can download and access it
● Embargoed - access is restricted until a particular date
● Restricted - people have to ask to access the data. Valid reasons include cultural, ethical and commercial issues

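The three access levels can be sketched as a small decision function. This is an illustrative model only: the `Access`, `Dataset` and `can_download` names are hypothetical, and it assumes an embargo lifts on the stated date itself and that a restricted dataset becomes available to a requester once the owner has approved the request.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Access(Enum):
    OPEN = "open"
    EMBARGOED = "embargoed"
    RESTRICTED = "restricted"


@dataclass
class Dataset:
    title: str
    access: Access
    embargo_until: Optional[date] = None  # only meaningful when EMBARGOED


def can_download(dataset: Dataset, today: date, approved: bool = False) -> bool:
    """Decide whether a requester may download the dataset today.

    `approved` means the dataset owner has granted this requester access.
    """
    if dataset.access is Access.OPEN:
        return True
    if dataset.access is Access.EMBARGOED:
        # Treat a missing date as "still embargoed" to stay on the safe side.
        return dataset.embargo_until is not None and today >= dataset.embargo_until
    # RESTRICTED: people have to ask, so access depends on approval.
    return approved
```

For example, a dataset embargoed until 1 July 2012 would be refused on 30 June 2012 and allowed from 1 July 2012 onwards, while a restricted dataset is refused unless `approved=True` is passed.
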
OK, I like this - can I use it?
Not yet ... today we have an internal alpha (soon to be beta) solution
● can create object record
● import object to repository
● search for object in repository
● create collection record in RDA
Aim is to have a public beta mid 2012
Aim to also publish collection records to other discipline-specific repositories where appropriate

I have some data ...
If you have some data you would be interested in depositing, please email me - we'd be interested.
firstname.lastname@example.org
(We can also talk about legacy format conversion if that helps)

These projects are supported by the Australian National Data Service (ANDS).
ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative.