Research Data Management
PROTECTING YOUR ASSETS

Katina Toufexis
Research Data Librarian
eResearch Support and Digital Developments Unit
The Data Deluge

    • Microarray Data
    • Protein Structure Data
    • DNA Sequencing Data

The University of Western Australia
The Data Deluge

“A single DNA sequencer can now generate in a day what it took 10 years to collect for the Human Genome Project. Computers are central to archiving and analysing this information, but their processing power isn’t increasing fast enough, and their costs are decreasing too slowly, to keep up with the deluge.”
    – Elizabeth Pennisi (Science Author)

“our capacity to measure, store, analyse and visualise data is becoming the new reality to which research will have to adapt.”
    – John Wilbanks (Creative Commons)

“Data is more like soup – it’s messy and you don’t know what’s in it.”
    – Liz Lyon (UK DCC)

“I worry there won't be enough people around to do the analysis.”
    – Chris Ponting (University of Oxford UK, Computational biologist)
Why protect your data?

Research data is valuable.

It is important to ensure the accessibility and security of datasets
which could be used as a baseline for future research.

“Data is the new oil”
    – Andreas Weigend (Chief Scientist, Amazon)
Why protect your data?
Security
    • Risk. Where is your data?

    • Safeguards against data loss.

    • Ensures confidentiality and ethical compliance.

    • Guarantees legal compliance with intellectual property rights such
      as copyright.
Research Lifecycle




Making datasets discoverable

ARDC
Benefits of Research Data Management

 Meets Compliance
 Promotes Efficiency
 Ensures Security
 Allows Access
 Improves Quality


Benefits: Compliance
International Funding Agencies
Mandates ensure that publicly-funded research data is managed,
described, and stored in life-long preservation formats to aid in
discovery and reuse into the future by other researchers.

National Science Foundation (www.nsf.gov)
Data Management Plan Requirements
“Proposals submitted or due on or after January 18, 2011, must include a supplementary document of no more than two pages labelled “Data Management Plan”. This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results.”
www.nsf.gov/bfa/dias/policy/dmp.jsp

Medical Research Council (MRC)
Data Management Plan Requirements
"All applicants submitting funding proposals to the MRC are required to include a Data Management Plan as an integral part of the application."
www.mrc.ac.uk/Ourresearch/Ethicsresearchguidance/datasharing/DMPs/index.htm
Benefits: Compliance
Australian Funding Agencies
The Australian Research Council (ARC) and the National Health and Medical
Research Council (NHMRC) may soon require data management planning as
part of their grant application process.
Research Data Management prepares researchers for the expected future
changes in Australian funding agency requirements in relation to research data
management following overseas trends.

Australian Code for the Responsible Conduct of Research
www.nhmrc.gov.au/guidelines/publications/r39
Developed jointly by the NHMRC, the ARC and Universities Australia.

National Principles of Intellectual Property Management for Publicly Funded Research
www.arc.gov.au/pdf/01_01.pdf
This document promotes data management & sharing of publicly funded IP outputs.

Sharing research data to improve public health: full joint statement by funders of health research
www.wellcome.ac.uk/About-us/Policy/Spotlight-issues/Data-sharing/Public-health-and-epidemiology/WTDV030690.htm
The NHMRC signed this document which requires that:
"standards of data management are developed, promoted and entrenched so that research data can be shared routinely, and re-used effectively."
Benefits: Compliance
The Code
The Australian Code for the Responsible Conduct
of Research guides researchers and institutions in
responsible research practices.

Section 1: General principles of responsible research
Section 2: Management of research data and primary materials
              – Proper management and retention of research data.
              – Researcher must decide which data to retain.
              – May be determined by law, funding agency,
                publisher or a discipline’s convention.
Benefits: Compliance

Compliance with UWA Code of Conduct for the Responsible Practice of Research
http://www.research.uwa.edu.au/staff/research-policy/guidelines

Section 2 refers to the management of research data and primary materials and states:
    2.1 Data (including electronic data) must be recorded in a durable and appropriately
    referenced form.

    2.2 Data must be held for sufficient time to allow access and reference. A minimum of
    5 years from the date of publication is recommended, but up to 15 years for specific
    types (e.g. clinical studies).

    2.3 Wherever possible, original data must be retained in the school or research centre in
    which it was generated... In all cases, prior to the publication of research findings a
    Location of Data Form must be completed.

These guidelines should be seen as a framework for sound research practice and for the
protection of individual research workers, including both staff and postgraduate research
students, from possible misunderstandings.
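
The retention periods quoted in section 2.2 can be sketched as a small date calculation. This is only an illustration: the category labels and the function name are hypothetical and are not defined by the UWA Code, and the 15-year figure is an upper bound that applies to specific data types, so the applicable period should always be confirmed against the Code itself.

```python
from datetime import date

# Retention periods in years, taken from the figures quoted in section 2.2.
# The category names are hypothetical labels chosen for this sketch.
RETENTION_YEARS = {
    "general": 5,          # recommended minimum from date of publication
    "clinical_study": 15,  # upper figure quoted for specific data types
}


def earliest_disposal_date(published: date, category: str = "general") -> date:
    """Return the first date on which disposal could be considered."""
    target_year = published.year + RETENTION_YEARS[category]
    try:
        return published.replace(year=target_year)
    except ValueError:
        # 29 February published date and a non-leap target year:
        # roll forward to 1 March so the data is never held too briefly.
        return date(target_year, 3, 1)
```

A call such as `earliest_disposal_date(date(2013, 6, 30), "clinical_study")` gives a date that could be recorded alongside the dataset's location details, so the retention decision is documented at publication time rather than reconstructed later.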


Benefits: Compliance
Publishers
Meets the requirements of publishers who have data sharing policies which
require good data management strategies.

Nature Publishing Group
http://www.nature.com/authors/policies/availability.html
“... authors are required to make materials, data and associated protocols promptly available to readers.”

Science
http://www.sciencemag.org/site/feature/contribinfo/prep/gen_info.xhtml#dataavail
“All data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science.”

Cell
http://www.cell.com/authors
“One of the terms and conditions of publishing in Cell is that authors be willing to distribute any materials and protocols used in the published experiments to qualified researchers for their own use."
Benefits: Efficiency

                   •         Improves research process.

                   •         Encourages systematic
                             documentation and descriptions

                   •         Provides guidelines and procedures
                             ensuring consistency.




Images: http://uclafacultyassociation.blogspot.com.au/2012/10/uc-officials-release-thousands-of.html; http://senior.ceng.metu.edu.tr/2012/viceversa/document.php;



Benefits: Access
                  • Validation and verification

                  • Enables collaborative research
                    opportunities

                  • Prevents duplication of research

                  • Allows data sharing

                  • Increases citations
Image: http://coupeweb.ca/services.html; http://theaphidroom.wordpress.com/2012/06/07/data-and-specimens-sharing-opportunity-or-burden/;



Benefits: Quality


                  • Allows for data replication

                  • Increases accuracy or reliability

                  • Ensures research data integrity.




Image: http://www.gentec-eo.com/products/calorimeters



How to find the Research Data Management Toolkit at UWA




Research support services web page
http://www.is.uwa.edu.au/research

Research Data Management web page links to the LibGuide
http://www.is.uwa.edu.au/research/research-data-management-toolkit
Research Data Management Toolkit - LibGuide


 1.   Intellectual Property
 2.   Documentation and Metadata
 3.   Storage and Backup
 4.   Sharing and Reuse
 5.   Retention and Disposal


This is supported by the Australian National Data Service (ANDS)

ANDS is supported by the Australian Government through the
National Collaborative Research Infrastructure Strategy Program
and the Education Investment Fund (EIF) Super Science Initiative


Introduction to Research Data Management at UWA


Editor's Notes

  • #2 This training session will outline the expected roles of the librarian within the scope of Research Data Management. I will be providing an introduction to RDM today, and in the future we will be providing further, more detailed sessions on the various aspects of RDM. If you have any questions regarding what you see here, please feel free to ask me at the end of the session.
  • #3 Academic researchers are dealing with a massive amount of research data. With the advancements in digital technologies, research capabilities have expanded in leaps and bounds, and researchers can accrue huge amounts of data in short periods of time. Great volumes of data can now be created within minutes (what used to take months). Within the science field, examples include microarray data, protein structure data, and DNA sequencing data.
  • #4 Here are some quotes relating to the data deluge. Until recently, research outputs were primarily text (hardcopies or digital versions of the hardcopy). Researchers within academic institutions are creating a rich resource of research data which has potential beyond the original scope of the project it was created for.
  • #5 I love this quote by Weigend – Data is the new oil. That’s why we need to protect it and make it accessible and reusable – but how? Using correct data management strategies!
  • #6 Researchers need to protect their data from destruction – as you can see, this can occur in many ways. As a result, storing data in the right place is vital. So is storing it in retrievable formats! And file naming or labelling ensures legal compliance. Doing this guarantees legal compliance with IP rights.
  • #7 This diagram was developed by the eResearch Support team to demonstrate where RDM planning fits within the research lifecycle. It shows the steps in the research lifecycle. Step 1 – the researcher has the initial concept. Step 2 – planning for the entire project. Step 3 – the researchers create their grant/project proposal. Step 4 – they start their project. Step 5 – data collection. Step 6 – conclusion of the project. Step 7 – reporting and publication. Before starting a project, researchers should be encouraged to document their Research Data Management Plan in THIS SECTOR of the lifecycle (before project start-up). The RDM Plan is then implemented throughout each phase of the research lifecycle. The Plan covers issues such as these, which I will cover in a little more detail today: ownership, documentation, security, sharing and disposal of research data. Let's say that the researcher has confirmed the IP rights of their data in the planning stage. In the reporting and publication phase, they will have predetermined who needs to be acknowledged or cited. Another example: the researcher needs to work out in the planning stage whether the data can be shared or re-used. They would need to determine whether there are any ethical or confidentiality issues associated with the data. This would affect how they label their datasets in the data collection phase. It would also affect how they store their datasets in the project conclusion phase. It will also affect how they report their findings in the reporting and publication phase, and who has access. Documenting all of these issues allows research data to be used, reviewed and modified via follow-up projects beyond the scope of the original research project.
  • #8 The Australian National Data Service (ANDS) is creating the infrastructure for Australian researchers and research organisations to point to their research datasets in the Australian Research Data Commons (ARDC). So ANDS has the charter to build the ARDC. This publicises the existence of their research data collections, making them discoverable, giving a competitive advantage, and promoting collaborations and data sharing. Sharing research data from publicly funded research makes best use of the tax-payer’s dollar! In 2011 and 2012 UWA worked with the Australian National Data Service on a Seeding the Commons Project. ANDS funded and supported UWA in this project. We manually entered descriptions and metadata of UWA datasets in RDA. Other Australian universities are doing the same thing. Research Data Australia (RDA) is a discovery service and catalogue for Australian research data collections.
  • #9 Effective data management has many benefits. It will ensure the responsible conduct of research in several key areas which are outlined here: data compliance, efficiency, access, quality and security. I will briefly go through them one at a time now.
  • #10 (Read slide paragraph.) International funding agencies which require research data management planning include the US National Science Foundation (NSF) and the UK Medical Research Council (MRC), both with their Data Management Plan Requirements. The ARC and the NHMRC may soon require data management planning within the funding application process, and we should be ready to support researchers with this.
  • #11 In the Australian arena – as I said in the previous slide – our funding agencies may soon ask for data management as part of their grant application process. These documents point towards this trend. The Code was jointly developed by the NHMRC, the ARC and Universities Australia. More recently, in 2011, seventeen signatories including the NHMRC signed a joint statement which requires that "standards of data management are developed, promoted and entrenched so that research data can be shared routinely, and re-used effectively."
  • #12 You may be familiar with the Code. As I mentioned in the previous slide, it was jointly developed by the ARC, the NHMRC and Universities Australia. In order to receive NHMRC funding, researchers must comply with the Code. What is the Code? It provides guidelines for institutions and researchers, promotes best practices in responsible research, and promotes research integrity. Section 1 covers general principles of responsible research. Section 2 covers the management of research data and primary materials.
  • #13 The UWA Code of Conduct for the Responsible Practice of Research is based on the Australian Code for the Responsible Conduct of Research. It is a guideline for best research practice. It protects researchers, including both staff and postgraduate research students, from any misunderstandings (such as disputes over research claims). Section 2 refers to data formats, duration of retention and location of data.
  • #14 RDM ALSO meets the requirements of publishers who have data sharing policies which require good data management strategies. Here are three examples: Nature, Science and Cell.
  • #15 Now that’s the end of compliance as a benefit of RDM, but RDM also promotes data efficiency. It improves management of data and the research process as a whole, encourages systematic documentation and descriptions of the research data, and provides guidelines and procedures ensuring consistency of research.
  • #16 Access is another benefit of RDM. It allows researchers to validate and verify published results. It enables collaborative research opportunities, thereby increasing the potential scale and scope of research. It prevents duplication of research within a particular field. It allows data sharing and future use when the data is preserved in retrievable formats. It increases citations for the researcher.
  • #17 Improved quality of research is another benefit of RDM. It allows for data replication or reproducibility, increases the accuracy or reliability of the data, and ensures research data integrity.
  • #19 The RDM Toolkit is ideal as a reference tool or go-to guide for our librarians. It provides all the links related to research data management issues. Everything that I will go through today is available in the Toolkit for your quick reference. You can also direct the researchers themselves to the Toolkit. The Toolkit covers the 5 main areas of RDM.
  • #20 Thank you for listening, and please feel free to ask questions or make any suggestions regarding this new area.