A presentation given as part of the DC101 training course run by the DCC at Oxford University in June 2010. The course provided data management guidance for researchers.
9. Funders’ data policies http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
Editor's Notes
Given the audience, I’ll reflect on two pieces of DCC work: the DAF tool, which has been used primarily by service providers or intermediaries to investigate what’s happening in terms of data management at the coalface and to explore service gaps to see what support researchers need; and research funders’ policies, specifically their data management and sharing plan requirements, as these are directly relevant to researchers.
DAF was established in response to a recommendation in the Dealing with Data report. The framework aimed to do two things: to help HEIs find out what data were held by research groups, departments etc., because until you know what you have you can’t begin to manage it; and to explore what was happening with those data: were people aware of policies or best practice for data management, and what was practical to achieve? DAF has been used in various settings (research groups, departments, schools) and at different institutions, including Oxford, so I’ll touch on that example.
DAF was used at Oxford as part of the ‘scoping digital repository services project’, which was essentially looking at what repositories need to provide if they are to store, disseminate and preserve research data. Surveys have been undertaken in medical, physical science and humanities research groups.
The interview-based approach engaged researchers in discussion about the lifecycle of their data from funding through data collection, management and publishing. Specific emphasis was placed on service gaps and requirements researchers had for support on managing data.
They visualized the findings by mapping them against the DCC lifecycle model. Key findings for CMFEG were: High-resolution images of the heart were created, needing storage of c. 1.5 TB per heart. Data sharing with other research groups, e.g. the computational biology group, was difficult because of the volumes: a server needs to be physically transported and the data copied. Better infrastructure was needed. There is no national facility to help with long-term preservation and sharing, but BBSRC requires that data are kept for 10 years…
Key findings for Young Lives were: Five rounds of data collection, one every 3 years. Changing environments: round 1 data were posted on disk, round 2 data were transferred via web server, and round 3 uses PDAs for collection and upload. ESDS has provided guidance, and there are well-documented procedures for standardising data collection, quality control, analysis etc.
The DCC has looked at funders’ policies for data curation and compared these in a table, which is available online. Several expect researchers to consider data management and sharing at the application stage, so we’ll look at these requirements in more detail.
The AHRC and ESRC have a section of the Je-S form set aside for data management. These are the six sections the AHRC expects a response on.
These are the five questions ESRC ask. Both AHRC and ESRC ask about archiving and sustainability, while the biomedical funders seem to place more emphasis on data sharing.
BBSRC, MRC and Wellcome all ask for a short statement / plan to be produced and suggest themes that may be applicable, rather than asking specific questions. Focus here is on data sharing – secondary use, methods for sharing, timeframes for release…
MRC provide quite a few points for consideration covering all aspects: data to be created (and the added value it will bring); future use of data (plans / timeframes to facilitate this); and arrangements for data management (in project) and preservation (post-project).
Wellcome Trust similarly covers all bases, from creating data (managing IPR / research participants) to depositing for long-term preservation.
These seem to be the five main questions asked across the board. The first link takes you to a document that provides a comparison of what each funder asks for, and the DCC link is to our guidance on data planning. We’re also providing an online tool to help in the formulation of data management and sharing plans.
Planning for data creation is the ‘conceptualisation’ stage of the DCC lifecycle, so notes from those earlier slides will help too. Decisions made at this stage have an impact on what can happen later on, so it is worth planning to get things right from the start. Think about what you want people to be able to do later, so that you get the right consent agreements, choose the right formats, sort out rights management etc… Various support is available through Oxford and elsewhere.
Lots of things to consider in terms of creating / collecting data. How can you develop processes / procedures so there’s a standard approach and consistent quality? Make sure data are secure and well managed: have plans for storage etc. Speak to OUCS; they offer support to research projects.
Think of all the different types of information users (and you!) will need to understand the data in the future. If these aren’t captured at the time, it’s very hard to do so later. UKDA provides really useful advice on documentation, which is relevant to all areas, not just social sciences.
Think about how best to move data around. Can you get support from OUCS or use infrastructure like SharePoint at Oxford, which is in development? Data sharing is often expected, so how will you manage this? There’s a public health conference on data sharing in September that may be relevant to many of you.
Do you know what’s required for the long-term and what support you can draw on? Various subject-specific data centres are available, as well as local support at Oxford through ORA and OUCS.
For the exercise we want to think a bit more about the local context at Oxford. Break into subject-based groups, with one or two support staff in each, and discuss the three questions listed.