Dr Jeff Christiansen (QCIF) introduced med.data.edu.au, a national facility to provide petabyte-scale research data storage, and related high-speed networked computational services, to Australian medical and health research organisations.
Webinar: https://www.youtube.com/watch?v=5jwBwDJrWAs
Jeff Christiansen Snippet: https://www.youtube.com/watch?v=PV_vuUKRm6w
Transcript: https://www.slideshare.net/AustralianNationalDataService/transcript-storing-and-publishing-health-and-medical-data-16052017
ANDS health and medical data webinar 16 May. Storing and Publishing Health and Medical Data. Jeff Christiansen
1. Cloud-Based Data Storage, Computing and Sharing for Health and Medical Research
Dr Jeff Christiansen, Queensland Cyber Infrastructure Foundation (QCIF)
This work is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0). To view a copy of the license, visit https://creativecommons.org/licenses/by/4.0/
3. Research Lifecycle
Data is central
NHMRC Statement on Data Sharing - https://www.nhmrc.gov.au/grants-funding/policy/nhmrc-statement-data-sharing
4. Research Lifecycle
Data Infrastructure Requirements:
Storage
Management (Organising, Sharing with collaborators)
Analysis Tools
Compute (Desktop, Institutional, Cloud, HPC)
Dissemination of Results and Data Access (Sharing with others)
5. Health and Medical research data
• Includes data derived from, or directly related to, human beings.
• Individual people may be individually identifiable* in a proportion of these data.
• Individually identifiable data that also contains health, genetic, or biometric information is also considered sensitive#.
• Sensitive data carries legal and ethical responsibilities to ensure that information is not intentionally or inadvertently disclosed to non-authorised individuals.
• Everyone associated with these data (Data Custodians, Researchers, Research Collaborators and Data Infrastructure Operators) has a shared ethical responsibility to ensure that harm does not come to any research participant through unauthorised release of identifiable data.
• Risk management is required: how can data be used and shared in a suitably secure manner?
Data infrastructure operators have a duty to Data Custodians and Researchers to demonstrate that they:
• have appropriate levels of maturity and discipline in information security practice to store human-derived research data, and
• have a repertoire of administrative, physical, and technical safeguards in place to assure custodians of the security of the information.
* - NHMRC National Statement on Ethical Conduct in Human Research (2007 – Updated May 2015) https://www.nhmrc.gov.au/guidelines-publications/e72
# - Privacy Act 1988 (Cth) https://www.legislation.gov.au/Series/C2004A03712
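The classification rule in the bullets above (identifiable data that also contains health, genetic, or biometric information is treated as sensitive) can be sketched in a few lines of Python. This is a minimal illustration only: `DatasetProfile`, `DataClass`, `classify` and `needs_risk_discussion` are hypothetical names, not part of med.data.edu.au or any NHMRC tooling, and the sketch leaves out re-identifiable data and everything else a custodian would weigh in a real assessment.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    """Illustrative categories taken from the wording of the slide."""
    NON_IDENTIFIABLE = "non-identifiable"
    IDENTIFIABLE = "individually identifiable"
    SENSITIVE = "sensitive"


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical descriptor of a dataset; field names are assumptions."""
    individually_identifiable: bool
    has_health_info: bool = False
    has_genetic_info: bool = False
    has_biometric_info: bool = False


def classify(profile: DatasetProfile) -> DataClass:
    """Apply the slide's rule: identifiable + health/genetic/biometric = sensitive."""
    if not profile.individually_identifiable:
        return DataClass.NON_IDENTIFIABLE
    if (profile.has_health_info
            or profile.has_genetic_info
            or profile.has_biometric_info):
        return DataClass.SENSITIVE
    return DataClass.IDENTIFIABLE


def needs_risk_discussion(profile: DatasetProfile) -> bool:
    """Any individually identifiable dataset warrants a risk-management discussion."""
    return classify(profile) is not DataClass.NON_IDENTIFIABLE


if __name__ == "__main__":
    cohort = DatasetProfile(individually_identifiable=True, has_genetic_info=True)
    print(classify(cohort).value, needs_risk_discussion(cohort))
```

A flag like this can only prompt the conversation between custodians and infrastructure operators; it does not replace it.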
8. What is med.data.edu.au?
Nationally-funded Data Infrastructure for Health and Medical Research Data
Funded by NCRIS through the RDSI (Research Data Storage Infrastructure) and RDS (Research Data Services) projects
14. What is med.data.edu.au?
Data Infrastructure
Cloud Storage (Networked via AARNet)
Management (Mediaflux, MyTardis, Aspera)
Compute (Nectar Research Cloud, HPC)
Analysis Tools (BYO Software)
Dissemination of Results and Data Access (Data Registry)
Resource Library
Protecting Personal Health Information in Research
Legislation (Cth, All States and Territories)
Codes, Policies and Best Practice (NHMRC, NIH)
IT security framework (ASD Info Security Manual)
Interactive Use Guide (Is med.data right for my data?)
19. Identifiable vs. non-identifiable data
The majority of the 6 PB of data is re-identifiable/non-identifiable.
For individually identifiable data, a discussion about risk management is required:
• What is the sensitivity level of the data?
• Use Cases: How, where and by whom is the data to be used?
The answers dictate the specifics of the IT security set-up (e.g. encryption, authentication) required.
Security policies are set by each Node Operator.
Happy to discuss with Data Custodians and Institutional IT Security Officers:
• Use Cases
• Security set-up cf. ASD ISM Principles and Controls
Australian Signals Directorate Information Security Manual
https://www.asd.gov.au/infosec/ism/
20. Want to know more?
Contact Us
Interactive Use Guide
Funding
Operators