Research Integrity Advisor Data Management Workshop
Dr Paul Wong
Senior Data Management Specialist
31 March 2017, QUT, Brisbane
The Australian National Data Service (ANDS) makes
Australia’s research data assets more valuable for
researchers, research institutions and the nation.
Partnered with Australian
research organisations
and co-funded 294 data
projects, totalling $54M
Research Data Australia
Cite My Data / DOI minting
Vocabulary service etc.
In 2015, conducted over
100 workshops, forums,
and webinars with over
4000 participants,
developed online
resources e.g. guides,
videos etc.
• 40+ guides organised
around different topics
• Content is a moving
target – changing
policy landscape, new
practices etc.
• Designed as a
community resource
• If you see gaps, we
want your help to
make them better
http://www.ands.org.au/guides
• A dedicated set of
webpages on data
management
• A community resource
• If you see gaps, we
want your help to
make them better
http://www.ands.org.au/working-with-data/data-management
https://projects.ands.org.au/policy.php
Research data: as input & output
Research data may include:
• Laboratory and field notes
• Raw experimental data
• Analysed data
• Simulations and software
• Databases
• Clinical data, including clinical records
• Questionnaires/surveys
• Images and photographs
• Audio-visual materials
Moynihan's field notes,
Panama, 1958 – CC BY
https://flic.kr/p/dmXHkJ
Screen capture of “Computer simulation of March 22,
2014 landslide event near Oso, Washington, by David
L. George and Richard M. Iverson, USGS”
http://youtu.be/2NzHCOhKr7g CC BY
Creative arts research data
Research data in the creative arts may include:
• Audio-visual recordings of a creative work
• Visual diaries
• Journals
• Drawings
• Photographs
• Manuscripts
• Musical annotations
• 3D models
Research Data: a Broad Church
Physical → Digital
Handwritten letters → Scanned & OCR version
Images or photos → Scanned digital version
Soil samples → Analysed result of samples
Tissue samples → Analysed result of samples
Archaeological dig sites → 3D models of the dig site
…
ANDS’ primary focus is digital data
Why Bother?
Why manage (digital) research data?
In fact, why bother managing anything?
• Prevent bad things from happening.
• Enable good things to happen.
Data and Research Integrity
Nature 533, 452–454 (26 May 2016) doi:10.1038/533452a. Reprint with permission © 2016 Macmillan
Data and Research Integrity
“The Availability of Research Data Declines Rapidly with Article Age”, Vines
et al., Current Biology, Volume 24, Issue 1, p94–97, 6 January 2014
• “For papers where authors reported the status of their data, the odds
of the data being extant decreased by 17% per year...”
• “Responses included authors being sure that the data were lost (e.g.,
on a stolen computer) or thinking that they might be stored in some
distant location (e.g., their parent’s attic) to authors having some
degree of certainty that the data are on a Zip or floppy disk in their
possession but no longer having the appropriate hardware to access
it.”
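As a rough illustration of what a 17% annual decline in odds adds up to, the sketch below compounds the decline over several years. It assumes the decline is multiplicative (each year's odds are 0.83 times the previous year's), which is the usual reading of a per-year change in odds but is an assumption here. The function name `relative_odds` is hypothetical.

```python
def relative_odds(years: int, annual_decline: float = 0.17) -> float:
    """Odds of the data being extant after `years`, relative to year 0.

    Assumption: the decline compounds multiplicatively every year.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 - annual_decline) ** years


if __name__ == "__main__":
    for years in (1, 5, 10, 20):
        share = relative_odds(years)
        print(f"after {years:2d} years: {share:.3f} of the initial odds")
```

Under this assumption, the odds after 10 years are roughly 16% of their starting value.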
Make Data Awesome
Open Research Data Collection Showcase
http://www.ands.org.au/partners-and-communities/projects/open-research-data-collection
#Dataimpact stories
http://www.ands.org.au/news-and-events/dataimpact
The companion case studies report of the Watt review
https://docs.education.gov.au/system/files/doc/other/20151202_case_studies_volume_nc_0.pdf
Data Management in Practice
• An ANDS guide that outlines, in an easy-to-understand
practical framework, how research data can be managed
effectively in an institutional setting.
• 15 key points – with short descriptions, 7 pages long.
• Incorporating project management best practice
• Shared responsibilities model
• Continual data curation approach
• Road tested with librarians, data managers, researchers
and research support staff
The Current Thinking: FAIR
Findable, Accessible, Interoperable, Reusable
15 principles to ensure research data is FAIR
Mark D. Wilkinson et al. The FAIR Guiding Principles for
scientific data management and stewardship, Scientific
Data (2016). DOI: 10.1038/sdata.2016.18
“FAIRness is a prerequisite for proper data management and
data stewardship”
http://www.ands.org.au/__data/assets/pdf_file/0009/394056/research-data-management-in-practice.pdf
Continual data curation across domains
Data Curation as Documentation
Assigning metadata (structured data about the data)
• Who collected the data?
• Who funded the research project?
• When (and where) was it collected?
• Instruments and setting for collecting the data?
• Title of the dataset
• Methods used to process the data
• Etc. etc.
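The questions above can be captured as a small structured record. The sketch below is one possible layout, not an ANDS schema: the class name, field names and all values are hypothetical placeholders.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical minimal metadata record; field names are illustrative."""

    title: str
    collected_by: list[str]
    funded_by: str
    collected_on: str  # ISO 8601 date, e.g. "2017-03-31"
    collected_at: str  # place name or coordinates
    instruments: list[str] = field(default_factory=list)
    processing_methods: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the record to JSON so that it is machine readable."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values only; none of this describes a real dataset.
record = DatasetMetadata(
    title="Example soil moisture survey",
    collected_by=["A. Researcher"],
    funded_by="Example Funding Body",
    collected_on="2017-03-31",
    collected_at="Brisbane, Australia",
    instruments=["Example soil probe"],
    processing_methods=["Outliers removed; daily means computed"],
)
print(record.to_json())
```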
Metadata: from Light Touch to Heavy Duty
Light touch: structured, minimal, human readable
Heavy duty: structured, detailed, machine readable
Domain-specific metadata: ecological, geographic, biological
What is Data Citation?
Data citation refers to the practice of providing a reference to
data in the same way as researchers routinely provide a
bibliographic reference to outputs such as journal articles,
reports and conference papers. Citing data is now recognised
as one of the key practices leading to recognition of data as a
primary research output.
http://www.ands.org.au/working-with-data/citation-and-identifiers/data-citation
Data Citation Standard
A standard citation would include the following elements:
Author(s) (Year): Title. Publisher(s). DOI (if used)
Hanigan, Ivan (2012): Monthly drought data for Australia 1890-2008 using the
Hutchinson Drought Index. The Australian National University Australian Data
Archive. http://doi.org/10.4225/13/50BBFD7E6727A
Alternatively,
Author(s) (Year): Title. Version. Publisher(s). ResourceType. Identifier
Bradford, Matt; Murphy, Helen; Ford, Andrew; Hogan, Dominic; Metcalfe, Dan (2014):
CSIRO Permanent Rainforest Plots of North Queensland. v2. CSIRO. Data Collection.
http://doi.org/10.4225/08/53C4CC1D94DA0
http://www.ands.org.au/working-with-data/citation-and-identifiers/data-citation
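The two citation patterns above can be assembled mechanically from their elements. The following is a minimal sketch; the function name `format_data_citation` and its parameters are hypothetical, and the example values are taken from the citations on this slide.

```python
from typing import Optional, Sequence


def format_data_citation(
    authors: Sequence[str],
    year: int,
    title: str,
    publisher: str,
    identifier: str,
    version: Optional[str] = None,
    resource_type: Optional[str] = None,
) -> str:
    """Build 'Author(s) (Year): Title. [Version.] Publisher(s). [ResourceType.] Identifier'."""
    parts = [f"{'; '.join(authors)} ({year}): {title}."]
    if version:
        parts.append(f"{version}.")
    parts.append(f"{publisher}.")
    if resource_type:
        parts.append(f"{resource_type}.")
    parts.append(identifier)
    return " ".join(parts)


basic = format_data_citation(
    authors=["Hanigan, Ivan"],
    year=2012,
    title="Monthly drought data for Australia 1890-2008 using the Hutchinson Drought Index",
    publisher="The Australian National University Australian Data Archive",
    identifier="http://doi.org/10.4225/13/50BBFD7E6727A",
)
extended = format_data_citation(
    authors=["Bradford, Matt", "Murphy, Helen", "Ford, Andrew", "Hogan, Dominic", "Metcalfe, Dan"],
    year=2014,
    title="CSIRO Permanent Rainforest Plots of North Queensland",
    publisher="CSIRO",
    identifier="http://doi.org/10.4225/08/53C4CC1D94DA0",
    version="v2",
    resource_type="Data Collection",
)
print(basic)
print(extended)
```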
Institutional Data Management Framework
• Institutional policy and procedures
• Support services: people and other means of providing
advice and support
• IT infrastructure: the hardware, software and other
facilities
• Metadata management: so that data records can be
meaningful and fit for purpose
Pre Research
Data Management Plan
• data organisation and storage;
• metadata standards and guidelines;
• backups;
• archiving for long-term preservation;
• version control and derived data products;
• data sharing or publishing intentions, including licensing;
• ensuring security of confidential data;
• data synchronisation; and
• governance, roles and responsibilities.
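The topics listed above can double as a completeness checklist for a draft plan. The sketch below assumes a plan is stored as a simple mapping from topic to free text; the names `DMP_TOPICS`, `missing_topics` and `draft_plan` are hypothetical, and the plan text is a placeholder.

```python
from typing import Dict, List

# Topics copied from the Data Management Plan slide.
DMP_TOPICS: List[str] = [
    "data organisation and storage",
    "metadata standards and guidelines",
    "backups",
    "archiving for long-term preservation",
    "version control and derived data products",
    "data sharing or publishing intentions, including licensing",
    "ensuring security of confidential data",
    "data synchronisation",
    "governance, roles and responsibilities",
]


def missing_topics(plan: Dict[str, str]) -> List[str]:
    """Return the checklist topics that the plan omits or leaves blank."""
    return [topic for topic in DMP_TOPICS if not plan.get(topic, "").strip()]


draft_plan = {
    "backups": "Nightly copy to institutional storage (placeholder text)",
    "data synchronisation": "   ",  # whitespace-only entries count as missing
}
for topic in missing_topics(draft_plan):
    print("TODO:", topic)
```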
Pre Research
Storage requirements may vary across domains
Publishing and Sharing Data
Metadata | Research data
Open | Open
Open | Closed
Closed | Open
Closed | Closed
Publishing and Sharing data ≠ Open Access to data
“Open” and “Closed” are relative concepts.
“Closed” ≈ conditional access based on individual permission
“Closed” ≈ conditional access based on roles
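One way to picture the matrix is to give the metadata and the data separate access policies. The sketch below is illustrative only: the class names, the role name and the simplification that both policies share one list of approved users and roles are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Access(Enum):
    OPEN = "open"
    BY_PERMISSION = "closed: individual permission"
    BY_ROLE = "closed: role-based"


@dataclass
class PublishedDataset:
    """Hypothetical record with independent policies for metadata and data."""

    metadata_access: Access
    data_access: Access
    approved_users: Set[str] = field(default_factory=set)
    allowed_roles: Set[str] = field(default_factory=set)

    def _allowed(self, policy: Access, user: str, roles: Set[str]) -> bool:
        if policy is Access.OPEN:
            return True
        if policy is Access.BY_PERMISSION:
            return user in self.approved_users
        # BY_ROLE: the user needs at least one of the allowed roles.
        return bool(roles & self.allowed_roles)

    def can_see_metadata(self, user: str, roles: Set[str]) -> bool:
        return self._allowed(self.metadata_access, user, roles)

    def can_get_data(self, user: str, roles: Set[str]) -> bool:
        return self._allowed(self.data_access, user, roles)


# Open metadata, closed data: anyone can discover it, few can download it.
dataset = PublishedDataset(
    metadata_access=Access.OPEN,
    data_access=Access.BY_ROLE,
    allowed_roles={"approved researcher"},
)
print(dataset.can_see_metadata("visitor", set()))
print(dataset.can_get_data("visitor", set()))
print(dataset.can_get_data("sam", {"approved researcher"}))
```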
Post Research
Ethics Clearance and Data Access: A Case Study
Managing and Sharing Research Data: A Guide to Good Practice, SAGE 2014
https://uk.sagepub.com/en-gb/eur/managing-and-sharing-research-data/book240297
https://commons.wikimedia.org/wiki/File%3AFoot_and_Mouth_Disease_Map_-_geograph.org.uk_-_564718.jpg
Colin Smith [CC BY-SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons
Ethics Clearance and Data Access: A Case Study
Health and Social Consequences of the Foot and Mouth Disease Epidemic in North Cumbria, 2001-2003
(M. Mort Lancaster University 2006, funded by the Department of Health UK, Study Number 5407)
http://ukdataservice.ac.uk/use-data/guides/dataset/foot-and-mouth
http://discover.ukdataservice.ac.uk/catalogue/?sn=5407
• 54 local people were recruited to write weekly diaries over 18 months to describe their lives
and the recovery they observed around the area
• The study was supplemented with interviews and focus group discussions that included other
stakeholders
• The study obtained consent from participants before the research but did not get consent for
sharing and archiving data
• The research team and the Department of Health wanted to share and archive the data after
the completion of the research.
• The team had to obtain consent retrospectively and needed expert advice from copyright specialists
http://www.ands.org.au/__data/assets/pdf_file/0009/394056/research-data-management-in-practice.pdf
The framework
treats DM as a
set of coordinated
activities to
preserve the
evidence base of
research findings
and to make the
evidence base
more accessible
and reusable in
the long run.
Wise Advice
https://nicolahemmings.wordpress.com/2016/04/05/mistakes-ive-made-as-an-early-career-researcher/
Mistakes I’ve made as an early career researcher
APRIL 5, 2016
Nicola Hemmings (post-doc, University of Sheffield)
Failing to organise my data adequately (circa 2007).
Prepare your datasets like you would if you were giving them to a stranger
who knew nothing about them. Label, annotate and meticulously file your R
scripts. Incorporate read-me files into everything and write them for the
monkey that will be you in five years, when you return to your data and/or
analyses for some unforeseen but vitally important reason. Don’t get this
wrong. You will regret it.
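A read-me file of the kind described can be generated from a short list of files and their purposes. This is a minimal sketch, not a recommended standard: the function name `build_readme`, the layout and every example value are hypothetical.

```python
from datetime import date
from typing import Mapping


def build_readme(title: str, author: str, files: Mapping[str, str], created: date) -> str:
    """Return plain-text read-me content describing a dataset folder."""
    lines = [
        title,
        "=" * len(title),
        f"Author: {author}",
        f"Created: {created.isoformat()}",
        "",
        "Files:",
    ]
    # One line per file, sorted by name so the listing is stable.
    lines += [f"  {name}: {purpose}" for name, purpose in sorted(files.items())]
    return "\n".join(lines) + "\n"


readme = build_readme(
    title="Example field season analysis",
    author="A. Researcher",
    files={
        "analysis.R": "fits the models",
        "raw_counts.csv": "unedited field counts",
    },
    created=date(2016, 4, 5),
)
print(readme)
```

The returned text can then be saved next to the data, for example as README.txt in the same folder.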
Special Healthy Data Year
ANDS ran ‘Sharing health-y data: challenges and solutions’ workshops
in all capital cities in 2016-2017
Attended by researchers, and staff from the library, research office
and ethics office
Topics covered
• The data sharing landscape: funders and publishers
• Data de-identification
• Ethics and informed consent
• Licensing data
• How research data can be published (mediated
access, metadata, repositories)
Special Healthy Data Year
Coming up:
Health and Medical Data: 3 Short Lunchtime Bites Webinars in May
2017
Workshops with Health Libraries Australia: 10 medical and health
research data Things 'train the trainer' workshops. 31 May in
Brisbane, 13 June in Melbourne, 14 July in Perth. For health
librarians.
Dr Paul Wong
Senior Data Management Specialist
Paul.Wong@ands.org.au
+61 2 6125 0586
With the exception of logos, third party images or where otherwise indicated, this
work is licensed under the Creative Commons 4.0 International Attribution
Licence.
ANDS is supported by the Australian Government through the
National Collaborative Research Infrastructure Strategy Program.
Monash University leads the partnership with the Australian
National University and CSIRO.

Research Data Management in practice, RIA Data Management Workshop Brisbane 2017


Editor's Notes

  • #3 Note: Data is a very broad church within research. ANDS’ focus is primarily on digital data – not data in the “physical form” e.g. soil or tissue samples. Inclusive in our understanding of digital data, from all disciplines, e.g. quantitative and qualitative data, multi-media data (audio, images, videos), computer models.
  • #4 About 40 guides have been developed across a spectrum of data management topics
  • #17 15 key points – with short descriptions, 7 pages long. Incorporating project management best practice. Shared responsibilities model. Continual data curation approach.