2. The Data Deluge
Microarray Data
Protein Structure Data
DNA Sequencing Data
The University of Western Australia
3. The Data Deluge
“A single DNA sequencer can now generate in a day what it took 10 years to collect for the Human Genome Project. Computers are central to archiving and analysing this information, but their processing power isn’t increasing fast enough, and their costs are decreasing too slowly, to keep up with the deluge.”
– Elizabeth Pennisi (Science Author)
“our capacity to measure, store, analyse and visualise data is becoming the new reality to which research will have to adapt.”
– John Wilbanks (Creative Commons)
“Data is more like soup – it’s messy and you don’t know what’s in it.”
– Liz Lyon (UK DCC)
“I worry there won't be enough people around to do the analysis.”
– Chris Ponting (University of Oxford UK, Computational biologist)
4. Why protect your data?
Research data is valuable.
It is important to ensure the accessibility and security of datasets
which could be used as a baseline for future research.
“Data is the new oil”
– Andreas Weigend (Chief Scientist, Amazon)
5. Why protect your data?
Security
• Risk. Where is your data?
• Safeguards against data loss.
• Ensures confidentiality and
ethical compliance.
• Guarantees legal compliance with
intellectual property rights such
as copyright.
8. Benefits of Research Data Management
Meets Compliance
Promotes Efficiency
Ensures Security
Allows Access
Improves Quality
9. Benefits: Compliance
International Funding Agencies
Mandates ensure that publicly-funded research data is managed,
described, and stored in life-long preservation formats to aid in
discovery and reuse into the future by other researchers.
National Science Foundation (NSF)
www.nsf.gov
Data Management Plan Requirements
“Proposals submitted or due on or after January 18, 2011, must include a
supplementary document of no more than two pages labelled “Data
Management Plan”. This supplementary document should describe how the
proposal will conform to NSF policy on the dissemination and sharing of
research results.”
www.nsf.gov/bfa/dias/policy/dmp.jsp
Medical Research Council (MRC)
Data Management Plan Requirements
“All applicants submitting funding proposals to the MRC are required to
include a Data Management Plan as an integral part of the application.”
www.mrc.ac.uk/Ourresearch/Ethicsresearchguidance/datasharing/DMPs/index.htm
10. Benefits: Compliance
Australian Funding Agencies
The Australian Research Council (ARC) and the National Health and Medical
Research Council (NHMRC) may soon require data management planning as
part of their grant application process.
Research Data Management prepares researchers for the expected future
changes in Australian funding agency requirements in relation to research data
management following overseas trends.
Australian Code for the Responsible Conduct of Research
www.nhmrc.gov.au/guidelines/publications/r39
Developed jointly by the NHMRC, the ARC and Universities Australia.
National Principles of Intellectual Property Management for Publicly Funded Research
www.arc.gov.au/pdf/01_01.pdf
This document promotes data management & sharing of publicly funded IP outputs.
Sharing research data to improve public health: full joint statement by funders of health research
www.wellcome.ac.uk/About-us/Policy/Spotlight-issues/Data-sharing/Public-health-and-epidemiology/WTDV030690.htm
The NHMRC signed this document, which requires that:
“standards of data management are developed, promoted and entrenched so
that research data can be shared routinely, and re-used effectively.”
11. Benefits: Compliance
The Code
The Australian Code for the Responsible Conduct
of Research guides researchers and institutions in
responsible research practices.
Section 1: General principles of responsible
research
Section 2: Management of research data and
primary materials
– Proper management and retention of research
data.
– Researcher must decide which data to retain
– May be determined by law, funding agency,
publisher or a discipline’s convention.
12. Benefits: Compliance
Compliance with UWA Code of Conduct for the Responsible Practice of
Research
http://www.research.uwa.edu.au/staff/research-policy/guidelines
Section 2 refers to the management of research data and primary materials and states:
2.1 Data (including electronic data) must be recorded in a durable and appropriately
referenced form.
2.2 Data must be held for sufficient time to allow access and reference. A minimum of
5 years from the date of publication is recommended, but up to 15 years for specific
types (e.g. clinical studies).
2.3 Wherever possible, original data must be retained in the school or research centre in
which it was generated... In all cases, prior to the publication of research findings a
Location of Data Form must be completed.
These guidelines should be seen as a framework for sound research practice and for the
protection of individual research workers, including both staff and postgraduate research
students, from possible misunderstandings.
13. Benefits: Compliance
Publishers
Meets the requirements of publishers who have data sharing policies which
require good data management strategies.
Nature Publishing Group
http://www.nature.com/authors/policies/availability.html
“... authors are required to make materials, data and associated protocols
promptly available to readers.”
Science
http://www.sciencemag.org/site/feature/contribinfo/prep/gen_info.xhtml#dataavail
“All data necessary to understand, assess, and extend the conclusions of the
manuscript must be available to any reader of Science.”
Cell
http://www.cell.com/authors
“One of the terms and conditions of publishing in Cell is that authors be
willing to distribute any materials and protocols used in the published
experiments to qualified researchers for their own use.”
14. Benefits: Efficiency
• Improves research process.
• Encourages systematic
documentation and descriptions
• Provides guidelines and procedures
ensuring consistency.
Images: http://uclafacultyassociation.blogspot.com.au/2012/10/uc-officials-release-thousands-of.html; http://senior.ceng.metu.edu.tr/2012/viceversa/document.php;
15. Benefits: Access
• Validation and verification
• Enables collaborative research
opportunities
• Prevents duplication of research
• Allows data sharing
• Increases citations
Image: http://coupeweb.ca/services.html; http://theaphidroom.wordpress.com/2012/06/07/data-and-specimens-sharing-opportunity-or-burden/;
16. Benefits: Quality
• Allows for data replication
• Increases accuracy or reliability
• Ensures research data integrity.
Image: http://www.gentec-eo.com/products/calorimeters
17. How to find the Research Data Management Toolkit at UWA
Research support services web page
http://www.is.uwa.edu.au/research
Research Data Management web page links to the LibGuide
http://www.is.uwa.edu.au/research/research-data-management-toolkit
18. Research Data Management Toolkit
- LibGuide
1. Intellectual Property
2. Documentation and Metadata
3. Storage and Backup
4. Sharing and Reuse
5. Retention and Disposal
19. This is supported by the Australian National Data Service
(ANDS)
ANDS is supported by the Australian Government through the
National Collaborative Research Infrastructure Strategy Program
and the Education Investment Fund (EIF) Super Science Initiative
Editor's Notes
This training session will outline the expected roles of the librarian within the scope of Research Data Management. I will be providing an introduction to RDM today, and in the future we will be providing further, more detailed sessions on the various aspects of RDM. If you have any questions regarding what you see here, please feel free to ask me at the end of the session.
Academic researchers are dealing with a massive amount of research data. With advancements in digital technologies, research capabilities have expanded in leaps and bounds, and researchers can accrue huge amounts of data in short periods of time. Great volumes of data can now be created within minutes (what used to take months). Within the science field, examples include microarray data, protein structure data, and DNA sequencing data.
Here are some quotes relating to the data deluge. Until recently research outputs were primarily text (hardcopies or digital versions of the hardcopy). Researchers within academic institutions are creating a rich resource of research data which has potential beyond the original scope of the project it was created for.
I love this quote by Weigend: “Data is the new oil”. That’s why we need to protect it and make it accessible and reusable. But how? Using correct data management strategies!
Researchers need to protect their data from destruction – as you can see, this can occur in many ways. As a result, storing data in the right place is vital. So is storing it in retrievable formats! File naming or labelling matters too: doing this ensures legal compliance with IP rights.
This diagram was developed by the eResearch Support team to demonstrate where RDM planning fits within the research lifecycle. It shows the steps in the research lifecycle:
Step 1 – the researcher has the initial concept
Step 2 – planning for the entire project
Step 3 – the researchers create their grant/project proposal
Step 4 – the researchers start their project
Step 5 – data collection
Step 6 – conclusion of the project
Step 7 – reporting and publication
Before starting a project, researchers should be encouraged to document their Research Data Management Plan in THIS SECTOR of the lifecycle (before project start-up). The RDM Plan is then implemented throughout each phase of the research lifecycle. The Plan covers issues such as ownership, documentation, security, sharing and disposal of research data, which I will cover in a little more detail today.
Let’s say that the researcher has confirmed the IP rights of their data in the planning stage. In the reporting and publication phase, they will have predetermined who needs to be acknowledged or cited.
Another example: the researcher needs to work out in the planning stage whether the data can be shared or re-used. They would need to determine whether there are any ethical or confidentiality issues associated with the data. This would affect how they label their datasets in the data collection phase and how they store their datasets in the project conclusion phase. It will also affect how they report their findings in the reporting and publication phase, and who has access.
Documenting all of these issues allows research data to be used, reviewed and modified via follow-up projects beyond the scope of the original research project.
The Australian National Data Service (ANDS) is creating the infrastructure for Australian researchers and research organisations to point to their research datasets in the Australian Research Data Commons (ARDC). So ANDS has the charter to build the ARDC. This publicises the existence of their research data collections, making them discoverable, giving a competitive advantage, and promoting collaborations and data sharing. Sharing research data from publicly funded research makes best use of the taxpayer’s dollar! In 2011 and 2012 UWA worked with the Australian National Data Service on a Seeding the Commons project. ANDS funded and supported UWA in this project. We manually entered descriptions and metadata of UWA datasets in Research Data Australia (RDA), a discovery service and catalogue for Australian research data collections. Other Australian universities are doing the same thing.
Effective data management has many benefits. It will ensure the responsible conduct of research in several key areas which are outlined here: data compliance, efficiency, access, quality and security. I will briefly go through them one at a time now.
Read slide paragraph. International funding agencies which require research data management planning include the US National Science Foundation (NSF) and the UK Medical Research Council (MRC), both with their Data Management Plan Requirements. The ARC and the NHMRC may soon require data management planning within the funding application process, and we should be ready to support them with this.
In the Australian arena, as I said in the previous slide, our funding agencies may soon ask for data management as part of their grant application process. These documents point towards this trend. The Code was jointly developed by the NHMRC, ARC and Universities Australia. More recently, in 2011, seventeen signatories including the NHMRC signed a joint statement which states: “standards of data management are developed, promoted and entrenched so that research data can be shared routinely, and re-used effectively”.
You may be familiar with the Code. As I mentioned in the previous slide, it was jointly developed by the ARC, NHMRC and Universities Australia. In order to receive NHMRC funding, researchers must comply with the Code. What is the Code? It provides guidelines for institutions and researchers, promotes best practices in responsible research, and promotes research integrity. Section 1 covers general principles of responsible research. Section 2 covers the management of research data and primary materials.
The UWA Code of Conduct for the Responsible Practice of Research is based on the Australian Code for the Responsible Conduct of Research. It is a guideline for best research practice. It protects researchers, including both staff and postgraduate research students, from any misunderstandings (such as disputes of research claims). Section 2 refers to data formats, duration of retention and location of data.
RDM also meets the requirements of publishers who have data sharing policies which require good data management strategies. Here are three examples: Nature, Science and Cell.
Now that’s the end of compliance as a benefit of RDM, but RDM also promotes data efficiency. It improves management of data and the research process as a whole, encourages systematic documentation and descriptions of the research data, and provides guidelines and procedures ensuring consistency of research.
Access is another benefit of RDM. It allows researchers to validate and verify published results. It enables collaborative research opportunities, thereby increasing the potential scale and scope of research. It prevents duplication of research within a particular field. It allows data sharing and future use when the data is preserved in retrievable formats. It increases citations for the researcher.
Improved quality of research is another benefit of RDM. It allows for data replication or reproducibility, increases the accuracy or reliability of the data, and ensures research data integrity.
The RDM Toolkit is ideal as a reference tool or go-to guide for our librarians. It provides all the links related to research data management issues. Everything that I will go through today is available in the Toolkit for your quick reference. You can also direct the researchers themselves to the Toolkit. The Toolkit covers the 5 main areas of RDM.
Thank you for listening, and please feel free to ask questions or make any suggestions regarding this new area.