This document provides an introduction to research data management. It discusses what constitutes research data, the importance of managing data, and factors to consider such as documentation, metadata, data sharing and archiving. It also outlines the University of Oxford's policy on research data management and available support services to assist researchers in developing data management plans and ensuring the long-term preservation and sharing of research data.
Introduction to Research Data Management - 2014-01-27 - Social Sciences Division, University of Oxford
1. Introduction to research data management
Slides provided by the Research Support
Team, IT Services, University of Oxford
2. WHAT IS RESEARCH DATA MANAGEMENT?
3. What is data?
“A reinterpretable representation of information in a formalized
manner suitable for communication, interpretation, or processing.”
Digital Curation Centre
Slide adapted from
the PrePARe Project
4. What is data?
Any information you use in your
research
Slide adapted from
the PrePARe Project
5. What is data management?
How you organize, structure, store, and care
for the information used or generated during a
research project
It includes:
How you deal with information on a day-to-day
basis over the lifetime of a project
What happens to data in the longer term – what you
do with it after the project concludes
6. Why spend time and effort on this?
So you can work efficiently and
effectively
Because your data is precious
To enable data re-use and sharing
To meet funders’ and institutional
requirements
7. University of Oxford policy
Introduced July 2012
8. University of Oxford policy
The full policy can be viewed on the University of
Oxford Research Data Management website
Research data is the information needed ‘to support or
validate a research project’s observations, findings or
outputs’
Research data should be:
Accurate, complete, identifiable, retrievable, and
securely stored
Able to be made available to others
9. University of Oxford policy
Research data should be retained for ‘as long as they
are of continuing value to the researcher and the wider
research community’ – but a minimum of three years
Specific requirements from funders take precedence
Researchers are responsible for:
Planning for the ongoing custodianship of their data
Developing and documenting clear data management procedures
Ensuring that legal, ethical, and funding body requirements are met
Policy applies to University staff and doctoral students
Depositing relevant research data may ultimately become a condition
of award for doctorates
10. Funders’ requirements
Funding bodies are taking an increasing
interest in what happens to research data
You may be required to make your data
publicly available at the end of a project
Check the small print in your grant conditions
Many funders require a data management plan
as part of grant applications
Oxford’s RDM website provides a summary of
requirements
11. Thinking ahead is vital
It’s easy to think of long term data
management as something only relevant
to the end of a project
But many aspects of it need planning from the beginning
13. Documentation and metadata
Documentation is the contextual information
required to make data intelligible and aid
interpretation
A users’ guide to your data
Metadata is similar, but usually more structured
Conforms to set standards
Machine readable
14. Make material understandable
What’s obvious now might not be in a few months, years, decades…
MAKE SURE YOU CAN UNDERSTAND IT LATER
Adapted from ‘Clay Tablets with Linear B Script’ by Dennis, via Flickr: http://www.flickr.com/photos/archer10/5692813531/
Slide adapted from
the PrePARe Project
15. Make material verifiable and reusable
Image by woodleywonderworks, via Flickr:
http://www.flickr.com/photos/wwworks/4588700881/
• Detailing methods helps
people understand what
you did
• And helps make your
work reproducible
• Provide context to
minimize the risk of
misunderstanding or
misuse
Slide adapted from
the PrePARe Project
16. Documentation – what to include
• Who created it, when and why
• Description of the item
• Methodology and methods
• Units of measurement
• Definitions of jargon, acronyms and code
• References to related data
Slide adapted from
the PrePARe Project
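One way to make the checklist on this slide routine is to keep it as a small, checkable record alongside the data. The following is a minimal sketch only: the field names are hypothetical labels chosen to mirror the bullet points above, not part of any standard or of the original training materials.

```python
# Hypothetical field names mirroring the slide's bullet points.
REQUIRED_FIELDS = [
    "creator", "created", "purpose", "description",
    "methods", "units", "definitions", "related_data",
]


def missing_fields(record):
    """Return the checklist fields that are absent or left blank."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


def render_readme(record):
    """Format the filled-in fields as plain 'Field: value' lines."""
    missing = set(missing_fields(record))
    lines = [
        f"{field.replace('_', ' ').capitalize()}: {record[field]}"
        for field in REQUIRED_FIELDS
        if field not in missing
    ]
    return "\n".join(lines)
```

Calling `missing_fields` on a partly completed record gives a to-do list of what still needs documenting, and `render_readme` produces text that could be saved next to the data files.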
17. Metadata – data about data
A formal, structured description of a dataset
Used by archives to create catalogue records
18. Missing metadata – or the riddle of the sixth toe
This painting shows
Georgiana, Duchess of
Devonshire as Diana
… or maybe Cynthia
She has six toes – but
no one knows why
Public domain image from Wikimedia Commons:
http://commons.wikimedia.org/wiki/File:Georgiana_Cavendish,_Duchess_of_Devonshire_as_Diana.jpg
19. For discussion
What data management challenges have you
encountered?
What strategies have you personally found
useful?
Be ready to feed back to the group
20. WHAT HAPPENS AT THE END OF THE PROJECT?
21. Data archiving
The best way of ensuring long-term
preservation of your data is depositing it in an
archive or repository
DataBib provides a catalogue: http://databib.org/
A number of national disciplinary archives exist –
e.g. the UK Data Archive: http://www.dataarchive.ac.uk/
Oxford will soon have its own data archive
If possible, make it available for re-use
22. Why share data? Reputation
Get credit for high quality
research
Recognition for contribution
to research community
Open data leads to increased
citations
Of the data itself
Of associated papers
Slide adapted from
the PrePARe Project
23. Why share data? Reuse
Reduces duplication of
effort
Allows public research
funding to be used more
effectively
Contexts not currently
envisaged
Extend research beyond
your discipline
Slide adapted from
the PrePARe Project
24. Why share data? Be a trailblazer!
A paradigm shift in how research outputs are
viewed is occurring
Data outputs are of increasing importance –
and are likely to become even more so
Major journals are increasingly
looking to publish datasets
alongside articles
Be at the forefront of an
important shift in the
academic world
25. Video by NYU Health Sciences Libraries: http://www.youtube.com/watch?v=N2zK3sAtr-4
26. Data sharing – concerns
Ethical concerns
Confidential or sensitive data
Legal concerns
Third party data
Professional concerns
Intended publication
Commercial issues (e.g. patent protection)
27. Data sharing – concerns
• Redact or embargo if there is good reason
• Planning ahead can reduce difficulties
Slide adapted from
the PrePARe Project
28. Data licensing
A licence clarifies the conditions for accessing
and making use of a dataset
User knows what’s allowed without asking further
permission
Doesn’t exclude possibility of specific requests to
go beyond the terms of the licence
For databases, structure and content may be
covered by separate rights
29. Data licences - examples
Creative Commons licences
Six different flavours, plus CC0 public domain dedication
Widely used and recognized
http://creativecommons.org/
Open Data Commons
Specifically designed for datasets
Recognizes the structure/content distinction
http://opendatacommons.org/
30. Data licensing - guidance
‘How to License Research Data’
A guide from the Digital Curation Centre
http://www.dcc.ac.uk/resources/how-guides/license-research-data
32. Data management plans
A document which may be created in the early
stages of a project
While planning, applying for funding, or setting up
An initial plan may be expanded later
Details plans and expectations for data
Nature of data and its creation or acquisition
Storage and security
Preservation and sharing
33. Exercise
Using the resources available, have a go at
drafting a data management plan for your own
research
If there are questions you can’t answer at this
stage, make a note of
What you need to find out
Decisions you need to make
34. Digital Curation Centre
A national service
providing advice and
resources
Create a data
management plan
using the DMPonline
tool
http://www.dcc.ac.uk/
https://dmponline.dcc.ac.uk/
35. ‘In preparing for
battle, I have always
found that plans are
useless but planning
is indispensable.’
Dwight D. Eisenhower
37. DataBank and DataFinder
Two forthcoming University of Oxford services
Launch date TBC
38. DataBank
University of Oxford’s institutional data archive
Long term preservation for datasets without another
natural home
Datasets will be assigned DOIs
Will work alongside ORA, the University archive for
research publications
In some cases, may be a suitable home for DPhil data
Possible to link publications in ORA to datasets in DataBank
Depositors can opt to make datasets publicly available,
embargoed for a fixed period, or hidden
39. DataFinder
A catalogue of datasets
Will harvest metadata from DataBank and other
compatible data stores
Information on the nature, location, and availability of the data
So anything in DataBank will have a record in DataFinder
Researchers depositing data elsewhere strongly
encouraged to add a record to DataFinder
Should provide a substantial resource for researchers
seeking datasets for reuse
40. ORDS – Online Research Database Service
Specifically designed for academic research data
Cloud-hosted and automatically backed up
Web interface makes collaboration straightforward
If desired, databases can easily be made public
Designed to permit easy archiving
Currently being used by a small group of test users –
will become more widely available
later in 2014
http://ords.ox.ac.uk/
41. IT Learning Programme
Over 200 different IT
courses
Covering software, skills,
and new technologies
http://www.oucs.ox.ac.uk/itlp/
ITLP Portfolio offers
course materials and
other resources
http://portfolio.it.ox.ac.uk/
42. IT Services: Data Back-up on the HFS
HFS is Oxford’s central back-up and archiving
service
Free of charge to University staff and
postgraduates
Automated back-ups of machines connected to
University network
Copies kept in multiple places
43. IT Services: Research Support Team
Can assist with technical aspects of research
projects at all stages of the project lifecycle
But the earlier you seek advice, the better
For more information, see our website:
http://blogs.it.ox.ac.uk/acit-rs-team/about/
Or email us on
researchsupport@it.ox.ac.uk
45. Research data management website
Oxford’s central
advisory website
Covers data
management
planning, back-up and
security, data sharing
and archiving, funder
requirements, etc.
University policy is
available
http://www.admin.ox.ac.uk/rdm/
46. Research Skills Toolkit
Website and hands-on workshops
A guide to software,
University services,
and other tools and
resources for
research
Requires SSO login
http://www.skillstoolkit.ox.ac.uk/
47. Research Data MANTRA
Free online
interactive
training modules
Aimed at
postgraduates
and early career
researchers
http://datalib.edina.ac.uk/mantra/
48. Any questions?
Ask now, or email us on
researchsupport@it.ox.ac.uk
49. Rights and re-use
This slideshow was developed by the IT Services Research
Support Team, University of Oxford, based on a presentation
originally prepared as part of the DaMaRO Project
With the exception of clip art used with permission from
Microsoft, the slideshow is made available under a Creative
Commons Attribution Non-Commercial Share-Alike License
Parts of this slideshow draw on teaching materials produced by
the PrePARe Project, DATUM for Health, and DataTrain
Archaeology
Within the terms of this licence, we actively encourage sharing,
adaptation, and re-use of this material
Editor's Notes
The first question to address is what the term ‘data’ actually refers to. Definitions vary, and to some extent, what counts as data will depend on the field of study. For many people, their initial association with the word ‘data’ will be numerical information (statistics, spreadsheets, or experimental results, for example), or perhaps the contents of highly structured information sources such as relational databases.
However, data is far from being limited to these. Other examples include:
• Textual sources (literary or historical works that are being analysed, or interview transcripts)
• Websites (including all sorts of sites such as social media sites, as well as established academic sources)
• Works of art and other images
• Audio files (e.g. oral history, recordings of interviews or focus groups)
• Videos
• Emails
• Computer source code
• Books
• Papers
• Catalogues, concordances and indexes
The Digital Curation Centre suggests that data is “A reinterpretable representation of information in a formalized manner suitable for communication, interpretation, or processing.”
Image montage adapted from PrePARe Project slideshow ‘What is data?’: http://www.lib.cam.ac.uk/dataman/training.html
A very broad definition – such as ‘any information you use in your research’ – works well for thinking about data management: it helps make sure you don’t miss out something important!
Whatever your area of research is, you will be dealing with data in one form or another. Bear in mind that not all data is digital: print resources, handwritten notes, tape recordings, and hard copies of images may also be important sources.
In addition to the data you collect or generate and analyse as part of a research project, it’s also worth thinking about the data you will create. This might include very structured collections of information, such as a relational database – or it might be something much more informal, such as a file of your own notes, summaries you create for your own reference, or a list of items to be examined.
Image montage adapted from PrePARe Project slideshow ‘What is data?’: http://www.lib.cam.ac.uk/dataman/training.html
Data management is a broad term covering many aspects of the research process. It has two main strands: how you deal with information on a day-to-day basis during the active phase of a research project, and what happens to it in the longer term. Both of these are important, but because of limited time, we’ll deal mostly with the second today.
Most of us find that we have many calls on our time, and that packing everything that needs to be done into the week is often a challenge. That being the case, it’s easy to feel as though research data management is simply one more thing to add to an already endless to-do list – or worse, that it’s a distraction from real work. However, there are a number of key reasons that it’s worth paying some attention to it.
Good data management does require an investment of effort – but ultimately it’s something that can actually save you time, by helping you work more efficiently. You want to complete your research project to the best of your ability, but with minimum stress – and good research data management is one of the tools that can help you to do that.
Many of us are all too well acquainted with the frustration of trying to track down a fact or a document we know we have somewhere. Good research data management – setting up an organizational system that works for you, and ensuring everything is properly filed or labelled to enable re-identification and retrieval – can make life a lot easier. And it’s not just a matter of saving time and reducing unnecessary effort (though clearly that’s a major benefit): having everything well ordered can also help you get a better feel for the shape and scope of your research material, which in turn can enable you to spot patterns or connections that might otherwise get missed.
It’s also well worth doing because the data you’re producing or working with is valuable. As well as this being true for your own research, the data might ultimately be of use to other researchers.
Having everything well organized and properly labelled also has the potential to save you a lot of time at the end of a research project, when it comes to deciding what to do with your data – but more of that later.
Finally, there may be requirements imposed by your funding body and/or the university which you need to meet.
(The rest of the presentation fleshes out the points on this slide.)
As of summer 2012, the University of Oxford has an official policy on the management of research data and records
Note that the policy uses a specific definition of research data as the information that supports or validates research outputs. The policy only applies explicitly to data in this category – however, it’s still well worth thinking about the management of data construed more broadly, both from the perspective of making life easier for yourself, and because you may produce data that isn’t needed to back up an output from this particular project, but which nevertheless might be of use if shared with other researchers.
The policy outlines two broad types of responsibility that researchers have:
• The first of these is about data integrity – data should be correct and well stored
• The second is about data sharing – as far as is reasonably possible, data should be made available for others to use
It’s easy to put off thinking about long term data management – or even to regard it as something you don’t need to worry about until after a project concludes. However, doing that can often lead to problems: many aspects of data management need planning from the beginning of a project. More about the planning process later.
Image credit: Microsoft clip art
First of all, because documentation should be thorough, it will contain a lot of information that might seem obvious. But will that same information still be obvious in a few months’, years’, decades’, or centuries’ time?
It’s often tempting to assume that you will remember everything, but in fact it’s all too easy to forget crucial information. Good documentation also means that other people can understand what you’ve done and why. It’s important to include context (why you did your research, how it fits into other contemporary research, or follows on from previous work), as well as explaining your methods and analytical techniques. This is related to the next point…
Slide adapted from PrePARe Project slideshow “Explain It”: http://www.lib.cam.ac.uk/dataman/training.html
By providing documentation, you can record the methodology of how you generated, collected or produced your data (for example information about collection strategies, interview methods, survey techniques, algorithms, database searches), and how you reached your conclusions from your data (for example any statistical methods you used). This isn’t purely for the benefit of other researchers – it may also be useful for you if you need to replicate, adapt, or re-purpose an aspect of your research method later on.
But it does also mean that people can reproduce your research, either to verify your conclusions or as a starting point to develop your work further. In many research groups, this could be a student or post-doc who continues work started by a previous group member. Replicating methodology can also be a useful training tool.
It’s important to ensure all the relevant contextual information is provided to reduce the risk of data being misinterpreted.
Additionally, producing good metadata means that it’s easier to find your data, as it highlights the important aspects in a machine-readable way. This makes computer-based searches work better for you, whether you’re searching your own hard drive or looking for something in an online database: they’re more likely to find relevant files and information more quickly.
Slide adapted from PrePARe Project slideshow “Explain It”: http://www.lib.cam.ac.uk/dataman/training.html
Slide adapted from PrePARe Project slideshow “Explain It”: http://www.lib.cam.ac.uk/dataman/training.html
Metadata is a specific type of documentation – a formal description of a dataset which conforms to a particular structure. One typical use of metadata is to create a catalogue record for a dataset held in an archive.
The image shows metadata for a dataset from an anthropology project. It follows the Dublin Core metadata standard – a straightforward, widely-used structure which is not tied to any specific discipline. The metadata (in blue in this image) is enclosed in tags, much like HTML. This makes the metadata machine readable – by using a standard set of tags, an automatic system can tell where the information about the title, creator, description and so forth begins and ends.
Keeping proper records during a research project will make it easy to provide metadata when this is needed.
As well as being a requirement for depositing data in many archives, there are benefits to providing clear, comprehensive metadata. For example, it makes it much easier for other researchers to find your data. In turn, that means they’re more likely to reuse it, and give you credit.
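To illustrate what ‘machine readable’ means in practice, here is a minimal sketch that writes a few Dublin Core elements as tagged XML and then reads one back, using Python’s standard library. The namespace is that of the Dublin Core element set; the wrapping `metadata` element, the helper names, and the sample values are assumptions made for illustration, and are not the record shown on the slide.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set (title, creator, description, ...).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Wrap each field in a Dublin Core tag under a hypothetical root element."""
    root = ET.Element("metadata")  # wrapper name is an assumption
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return root


def read_field(root, name):
    """Return the text of one Dublin Core element, or None if it is missing."""
    element = root.find(f"{{{DC_NS}}}{name}")
    return None if element is None else element.text


# Sample values are invented for illustration.
record = build_record({
    "title": "Interview transcripts (example)",
    "creator": "A. Researcher",
    "description": "Hypothetical dataset used to illustrate tagging.",
})
xml_text = ET.tostring(record, encoding="unicode")
parsed = ET.fromstring(xml_text)
```

Because every value sits inside a standard tag, a program such as a catalogue harvester can pull out the title or creator with a call like `read_field(parsed, "title")`, without any human interpretation.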
This 18th century painting by Maria Cosway is part of a collection on display at Chatsworth House in Derbyshire. The subject is Georgiana Cavendish, Duchess of Devonshire (portrayed by Keira Knightley in the 2008 film The Duchess).
It shows her as Diana, the goddess of the moon. Some sources, however, say she’s depicted as Cynthia from Spenser’s Faerie Queene. (At time of writing, the Wikimedia Commons metadata is itself inconsistent: the image title says she’s Diana, but the image description says she’s Cynthia.) In fact, Diana and Cynthia are different names for the same figure, so this isn’t as much of a contradiction as it might appear. However, there’s plenty of potential for confusion here!
If you look closely, you can see that Georgiana has six toes. There are various theories about why this is: perhaps she really did have six toes (though there’s a lack of other evidence to support this), perhaps it’s an artistic shorthand hinting that the subject had supernatural abilities or a sixth sense, or perhaps the artist simply couldn’t count! However, no one really knows why: there’s no surviving record of the artist’s intention in giving her subject this unusual feature.
A symbolic message, or just a mistake? Without the relevant metadata, we’ll never know.
Image credit: Wikimedia Commons: http://commons.wikimedia.org/wiki/File:Georgiana_Cavendish,_Duchess_of_Devonshire_as_Diana.jpg
Discussion exercise for small groups, or for people to chat about during the coffee break.
The length of this exercise can be varied depending on the time available. If time permits, it may be useful to ask the small groups to feed back to the group as a whole, and in particular to encourage sharing of hints, tips, and solutions to specific problems.
As you’ve probably put a lot of effort into creating data in the course of your research, it’s worth thinking about how that data can be preserved for the long term after your project ends. As mentioned previously, many funders now require this. The best way to do this is to deposit it in an archive or repository. There may be an appropriate archive devoted to data in your discipline. For the social sciences, a major repository is the UK Data Archive, based at the University of Essex. Their website offers a host of useful information about best practice for data creation, storage, and sharing.
For datasets that don’t have another natural home, Oxford will soon have its own data archive – more of that later.
Ideally, data should be made available for others to re-use.
Sharing data can build your reputation in a number of ways. Laying your work open to scrutiny means that you will get credit for high quality research; it also increases understanding of your methods and allows your work to be verified by others. Sharing allows you to make a greater contribution to your community – and to be recognized for doing so. It can also help extend your reputation beyond that community.
There is also substantial evidence that making your data openly available leads to increased citations – of the datasets themselves, and of the papers or other publications based on the data.
Slide adapted from PrePARe Project slideshow “Share It”: http://www.lib.cam.ac.uk/dataman/training.html
Sharing your research allows it to be re-used; this might be within your field, for example using the data as a starting point for a complementary study, or as test data for new software and algorithms. It might be useful for teaching purposes. Sharing data means that someone else working in a similar area doesn’t have to waste time duplicating the work you’ve already done. If datasets can be used in multiple research projects, that means the funding that allowed them to be created is being used more effectively – a key reason that many funding bodies are now requiring that data be shared where possible.
Data might even be re-used in contexts that can’t currently be envisaged – for example in new developments several years down the line, or in completely different fields. And you will get credit as your work will be cited each time.
Slide adapted from PrePARe Project slideshow “Share It”: http://www.lib.cam.ac.uk/dataman/training.html
A major change is happening within academia at the moment. Data outputs are being viewed as increasingly important, and this trend is only likely to continue: for example, major journals are increasingly looking to publish (or provide access to) datasets alongside the articles reporting on and interpreting the data.
This provides an exciting opportunity for researchers: a chance to be at the forefront of a new movement. It’s well worth embracing this change – if you start getting your data out there in the public sphere now, then you’ll have a head start.
Image credit: Microsoft clip art
Link to video from http://www.youtube.com/watch?v=N2zK3sAtr-4. A data management horror story by Karen Hanson, Alisa Surkis and Karen Yacobucci, of NYU Health Sciences Libraries.
In some cases, there may be concerns about sharing data, or reasons why all or part of a dataset needs to be kept private. These may be ethical (the data is confidential), legal (the dataset includes third party material with restrictions on usage), or professional (you intend to publish the results, and don’t want someone to get there first).
Image credit: Microsoft clip art
You can also redact material (for example, third party copyrighted material in a PhD thesis), or place an embargo on it so that it cannot be accessed for a certain period (for example, because of publisher requirements or a pending patent application). Such measures may also be necessary with some confidential information.
It’s worth noting that many difficulties or concerns about sharing data can be alleviated by advance planning. For example, ensuring you get proper permissions when data is collected can reduce problems with sharing personal data. If your dataset is a combination of third party data and new material, you may need to have a version of the data where these are kept separate. Proper documentation is also important here: this will help keep track of what you’re allowed to do with data, and what’s happened to it in the course of the project.
Slide adapted from PrePARe Project slideshow “Share It”: http://www.lib.cam.ac.uk/dataman/training.html
A data management plan is, as the name suggests, a document which outlines how data will be managed over the course of a project.
One may be created when a project is still in the initial planning stages, as part of a funding application (this may be a requirement), or when the project is in the process of getting underway.
It’s common for there to be more than one version of a plan: an initial outline might be produced for the funding application, then fleshed out if the application is successful.
The plan gives details of what sort of data the project expects to be dealing with, and what will be done with it. This might include:
• A description of the type of data that will be used and where it will come from – how it will be created, or where it will be obtained from if pre-existing datasets are being used
• How the data will be stored and kept safe during the project
• What plans there are for preserving the data after the end of the project, and for sharing it with other researchers
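The three areas a plan covers can be treated as a simple template, which also makes it easy to list what is still undecided. This is a minimal sketch under stated assumptions: the section keys and question wording are hypothetical paraphrases of the points above, not a funder’s template or the format used by any planning tool.

```python
# Hypothetical template: keys and questions paraphrase the three areas above.
PLAN_SECTIONS = {
    "nature_of_data": "What data will be used, and how will it be created or obtained?",
    "storage_and_security": "How will the data be stored and kept safe during the project?",
    "preservation_and_sharing": "How will the data be preserved and shared after the project?",
}


def open_questions(plan):
    """List the questions for sections the draft plan has not answered yet."""
    return [
        question
        for key, question in PLAN_SECTIONS.items()
        if not plan.get(key, "").strip()
    ]
```

Running `open_questions` on an early draft returns the questions that still need an answer, which fits the idea that an initial outline is fleshed out later.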
Practical exercise which can last a flexible amount of time. The resources available will include David Shotton’s ‘Twenty Questions for Research Data Management’, the DCC’s checklist leaflet, and a very basic data management plan template based on one developed by DataTrain. Participants can make use of whichever of these they find most helpful.
If it seems appropriate, this may be followed by a brief discussion session, in which participants are invited to give feedback on their experience of trying to draft a data management plan.
The Digital Curation Centre is a national service providing advice and resources to researchers and their institutions. Although their primary focus is (as their name suggests) on longer-term curation and preservation of research data, they offer information relating to the whole data lifecycle.
One particularly helpful resource is their online data management planning tool. When building a plan, you can select a template which reflects the requirements of your particular funding body.
A final thought on the subject of plans and planning.
A research project isn’t – or shouldn’t be – a battle, but President Eisenhower’s words nevertheless have some relevance in this context. It is almost inevitable that unexpected events will arise – it’s very rare that everything goes exactly as anticipated. But although this means you may often have to adapt your plan on the fly, this makes having created a plan in the first place more essential, not less. If you’ve thought through all the relevant issues, you’re less likely to be taken by surprise – and you’ll be better placed to respond when the unexpected does crop up.
Public domain image, from http://commons.wikimedia.org/wiki/File:Dwight_D._Eisenhower,_official_Presidential_portrait.jpg
DataBank and DataFinder are two forthcoming University of Oxford services. They will be key parts of a larger research data management infrastructure that the University is in the process of developing. These services are being offered in part to enable researchers to comply with funder requirements and the demands of the new University policy.
The launch date of these services is still to be determined: at the moment the plans are being reviewed by the relevant University committees.
(These screenshots are taken from the development versions – the final versions will look slightly different. It’s also possible the names of the services will change.)
DataBank will be the University of Oxford’s institutional data archive. It is intended to provide a long-term preservation option for datasets without another natural home – where, for example, no suitable national or discipline-based repository is available. Once depositing DPhil data becomes a condition of award for the degree, DataBank may be a suitable place for some DPhil data to be deposited. DOIs (Digital Object Identifiers) can be assigned to datasets deposited in DataBank. A DOI is a unique, permanent identifier for an electronic object such as a document, Web page, or dataset, and it can be set to point to wherever the object is currently hosted. This means a DOI can be used to refer to the dataset in publications and so forth, and as long as the DOI metadata is kept updated, it will always send the reader to the right place. (This is preferable to using a URL, as URLs frequently change.) DataBank will operate in parallel with ORA, which is the University’s archive for research publications. It will be possible to create a link between a publication in ORA and the underlying dataset in DataBank. Researchers depositing datasets in DataBank will have control over the availability of their data. They may choose to make a dataset publicly available, or to embargo it for a fixed period (so, for example, the data might become available a year or three years after being placed in DataBank). Sensitive data may be kept hidden permanently; in this case the data owner may choose either to make a record for the data available (so others can see that it exists, and perhaps contact the data owner to ask questions about it), or to make both data and record invisible.
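The way a DOI stays fixed while the hosting location changes can be illustrated with a small sketch. This is a toy in-memory model, not how DataBank or any real DOI resolver is implemented: the class `ToyDoiRegistry`, the DOI `10.1234/example-dataset`, and both URLs are hypothetical names invented for illustration.

```python
class ToyDoiRegistry:
    """Toy model of DOI resolution: a fixed identifier mapped to a changeable URL."""

    def __init__(self):
        self._targets = {}  # DOI string -> current hosting URL

    def register(self, doi, url):
        """Record where the object identified by `doi` is first hosted."""
        self._targets[doi] = url

    def update(self, doi, new_url):
        """Point an existing DOI at a new location (the DOI itself never changes)."""
        if doi not in self._targets:
            raise KeyError(f"unknown DOI: {doi}")
        self._targets[doi] = new_url

    def resolve(self, doi):
        """Return the current location for `doi`."""
        return self._targets[doi]


# Hypothetical identifier and locations, for illustration only.
doi = "10.1234/example-dataset"
registry = ToyDoiRegistry()
registry.register(doi, "https://old-host.example.org/datasets/42")
before = registry.resolve(doi)

# The dataset moves to a new host; only the registry entry is updated.
registry.update(doi, "https://new-host.example.org/archive/42")
after = registry.resolve(doi)

print(before)
print(after)
```

A publication that cited the DOI before the move needs no correction, because the citation contains only the identifier and the registry supplies the current location.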
DataFinder is a catalogue of datasets held by the University of Oxford and elsewhere. DataFinder records will provide information about the nature of the dataset, where it is hosted, and (if details are given by the source) the availability of the data. Records for non-digital data can also be created in DataFinder: in this case, the record will include a description of the data and contact details for the data holder. DataFinder will harvest metadata about datasets from DataBank, and from other repositories or data stores that make their metadata available in a suitable form. These include ORDS, the database service mentioned earlier. This means that if a dataset is deposited in DataBank, a record for it will automatically be created in DataFinder (unless, of course, the DataBank record is set to be invisible). It will also be possible to add records to DataFinder manually, and researchers depositing data elsewhere are strongly encouraged to do this. The aim is for DataFinder to include a comprehensive listing of datasets created or owned by members of the University of Oxford. Once populated, DataFinder will be a substantial resource for researchers who want to find datasets they might be able to reuse in their own research, or who are looking for information about research that has already been conducted.
A new University service which will become available later this year is ORDS – the Online Research Database Service. It’s designed to allow academic researchers to create relational databases, so it’s a tool that might be used as an alternative to something like Microsoft Access or FileMaker Pro. The service uses cloud storage: rather than your database being stored on your own computer, it’s hosted on a server, and you access it via a Web interface. This means you can access it from any computer with Internet access, and it also means back-up is taken care of automatically, without you needing to worry about it. The system is also set up to make collaboration – with people both in and outside Oxford – easy. All members of a project team can access the same version of the database, so there are no worries about whether you’re working with the latest version. The service will also allow users to make their databases publicly available if they wish. This might happen at the end of a project, or you might want to publish a specific sub-set of the data to accompany a research publication. If ORDS isn’t the most appropriate long-term home for your data, the system will be set up to allow easy transfer to the University’s new data archive (DataBank – more of that later) or elsewhere. ORDS will ultimately be a paid-for service. The service is currently being tested by a group of early adopters, but will become more widely available later in 2014. If you’d be interested in finding out more, please visit the ORDS website.
The IT Learning Programme offers an extensive range of IT courses. These cover how to use specific pieces of software, IT-related skills (such as database design or programming), and how to make use of new technologies (such as social media or podcasting). The ITLP Portfolio website offers course materials which you can use for self-study, and access to a range of other related resources.
Oxford has a central back-up and archiving service called HFS, provided via IT Services. (You may also sometimes hear people refer to this as TSM – this is the name of the client software used to run back-ups.) The service is free to University staff and postgraduates. You can set up the system to perform automated back-ups of computers connected to the University network (these usually happen overnight). If that’s not convenient, you can run a manual back-up. (If you’ve had trouble with automated back-ups, contact the HFS team and they should be able to help.) Three copies of your data will be made. One of these is stored outside Oxford, so even if there were to be a flood or a fire at IT Services, your data would still be safe. HFS is the gold standard for back-up, and it’s a good idea to use it if you can. But the really important thing is that you have a good back-up routine of some description – multiple copies, regularly updated, and stored securely in multiple places.
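Whatever back-up routine you use, it helps to check occasionally that the copies still match the original. The sketch below is one possible way of doing this by comparing SHA-256 checksums; it is not part of HFS or TSM, the function names `sha256_of` and `check_copies` are invented for illustration, and the temporary files stand in for real back-up locations.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_copies(original, copies):
    """Map each copy's path to True if it exists and matches the original's checksum."""
    expected = sha256_of(original)
    results = {}
    for copy in copies:
        copy = Path(copy)
        results[str(copy)] = copy.is_file() and sha256_of(copy) == expected
    return results


# Demonstration with throwaway files (hypothetical stand-ins for back-up locations).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    original = tmp / "survey.csv"
    original.write_text("id,answer\n1,yes\n")
    good_copy = tmp / "copy_a.csv"
    good_copy.write_text("id,answer\n1,yes\n")   # identical to the original
    stale_copy = tmp / "copy_b.csv"
    stale_copy.write_text("id,answer\n")         # out of date
    missing_copy = tmp / "copy_c.csv"            # never created
    report = check_copies(original, [good_copy, stale_copy, missing_copy])

for name, ok in report.items():
    print(Path(name).name, "OK" if ok else "needs attention")
```

A copy flagged as needing attention is either missing or differs from the original, which usually means it is out of date or damaged and should be refreshed.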
IT Services has a team of people who provide support to researchers. They can assist with various aspects of the technical side of a research project throughout the project lifecycle: planning, setting up, doing the work, and what happens at the end of the project. If you need some help setting up a database, building a website, or working out where and how to store your data, the Research Support Team may be able to help. The earlier in the research process you seek advice, the better, preferably while things are still in the planning stages. You can find more information on the team’s website, http://blogs.it.ox.ac.uk/acit-rs-team/about/, or by emailing researchsupport@it.ox.ac.uk
The University of Oxford has a central Research Data Management website, which serves as the main source of information on this subject. A copy of the University Policy on the Management of Research Data and Records can be downloaded from the site. At the time of writing, the website is being redesigned – the new version should be launched shortly.
The Research Skills Toolkit website provides a guide to useful software, tools, University services, and other resources for researchers. It includes a substantial section on managing information. The Toolkit team also holds a series of hands-on workshops each year. The site is hosted on WebLearn, and you’ll need to log in using your SSO credentials – the same username and password you use for Nexus email.
Research Data MANTRA is a series of free interactive online training modules covering key research data management issues. The modules are designed for postgraduates and early career researchers. The course describes itself as being particularly geared towards people working in geosciences, social and political sciences, and clinical psychology, but don’t be put off by this: much of the course material is relevant to all research disciplines.