An NIDDK Resource dknet.org
Are you ready for 2023?
dkNET Office Hours: New Data Mandates
Jeffrey S. Grethe
PI, NIDDK Information Network
Co-Director, FAIR Data Informatics Laboratory, UCSD
January 25, 2023
• US National Institutes of Health new
data sharing policy goes into effect
• All data must be managed; most
data should be shared
• “As open as possible; as closed as
necessary”
• Mandates the inclusion, approval
and execution of a Data
Management and Sharing Plan
• (DMP + S = DMS)
https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html
The Details...
The effective date of the DMS Policy is January 25, 2023, including for:
• Competing grant applications that are submitted to NIH for the January
25, 2023 and subsequent receipt dates;
• Proposals for contracts that are submitted to NIH on or after January 25,
2023;
• NIH Intramural Research Projects conducted on or after January 25, 2023;
and
• Other funding agreements (e.g., Other Transactions) that are executed on
or after January 25, 2023, unless otherwise stipulated by NIH.
More Details...
What
• Defines Scientific Data as: “The recorded factual material commonly accepted in the scientific
community as of sufficient quality to validate and replicate research findings, regardless of whether
the data are used to support scholarly publications. Scientific data do not include laboratory
notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for
future research, peer reviews, communications with colleagues, or physical objects, such as
laboratory specimens.”
• Even those scientific data not used to support a publication are considered scientific data and within
the final DMS Policy’s scope
When
• “[s]hared scientific data should be made accessible as soon as possible, and no later than the time of
an associated publication, or the end of the award/support period, whichever comes first.”
• Researchers may share data underlying a publication during the award period, while other data
that have not yet led to a publication may be shared by the end of the award period.
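The timing rule quoted above reduces to taking the earlier of two dates. A minimal sketch in Python, assuming a hypothetical helper name `sharing_deadline` and made-up dates; neither comes from the NIH policy text:

```python
from datetime import date
from typing import Optional


def sharing_deadline(award_end: date, publication: Optional[date] = None) -> date:
    """Latest date by which shared data should be accessible.

    Encodes "no later than the time of an associated publication, or the
    end of the award/support period, whichever comes first".
    """
    if publication is None:
        # Data that have not led to a publication: the award end date applies.
        return award_end
    return min(publication, award_end)


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    award_end = date(2027, 6, 30)
    print(sharing_deadline(award_end, date(2025, 3, 1)))  # publication comes first
    print(sharing_deadline(award_end))                    # no publication yet
```

The function only computes the outer bound; the policy still asks for data to be accessible "as soon as possible".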
Where
• Encourages the use of established repositories to the extent possible.
More Details...
How
• NIH encourages data management and data sharing practices consistent with the FAIR data
principles
Funding
• Fees for long-term data preservation and sharing are allowable, but funds for these activities must be
spent during the performance period, even for scientific data and metadata preserved and shared
beyond the award period.
Repercussions
• After the end of the funding period, non-compliance with the NIH ICO-approved Plan may be taken
into account by NIH for future funding decisions for the recipient institution
The DMS Policy applies to all research, funded or conducted in whole or in part by NIH, that results in the
generation of scientific data. This includes research funded or conducted by extramural grants, contracts,
Intramural Research Projects, or other funding agreements regardless of NIH funding level or funding
mechanism.
Data as a Research Product
“Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be
so in practice as well as theory, data must be accorded due importance in the practice of
scholarship and in the enduring scholarly record…”
https://www.force11.org/group/joint-declaration-data-citation-principles-final
Joint Declaration of Data Citation Principles
1. Data should be considered legitimate, citable products of research. Data citations should be
accorded the same importance in the scholarly record as citations of other research objects,
such as publications.
2. Data citations should facilitate giving scholarly credit and normative and legal attribution to all
contributors to the data, recognizing that a single style or mechanism of attribution may not be
applicable to all data.
3. In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data
should be cited.
A data citation looks like a regular citation
DOI
Full citation
DOI:10.34945/F5XW2P
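As a rough illustration of how a DOI such as the one above becomes a resolvable link inside a full citation, here is a minimal Python sketch. The `DataCitation` class, its field names, and the placeholder creator, title, and repository are assumptions made for this example; only the DOI string and the public `https://doi.org/` resolver prefix are taken as given.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    creators: List[str]
    year: int
    title: str
    repository: str
    doi: str  # accepts "10.34945/F5XW2P" or "DOI:10.34945/F5XW2P"

    def doi_url(self) -> str:
        """Turn the DOI into a resolvable https://doi.org/ link."""
        bare = self.doi.strip()
        if bare.lower().startswith("doi:"):
            bare = bare[4:].strip()
        return f"https://doi.org/{bare}"

    def format(self) -> str:
        """Render 'Creators (Year) Title. Repository. DOI link'."""
        authors = ", ".join(self.creators)
        return f"{authors} ({self.year}) {self.title}. {self.repository}. {self.doi_url()}"


if __name__ == "__main__":
    # Placeholder metadata; only the DOI string appears on the slide.
    citation = DataCitation(
        creators=["Example A", "Example B"],
        year=2023,
        title="Hypothetical dataset title",
        repository="Hypothetical Repository",
        doi="DOI:10.34945/F5XW2P",
    )
    print(citation.format())
```

The layout mirrors an ordinary article citation, which is the point of the slide: a dataset is cited the same way as a paper, with the DOI as the persistent link.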
Proper data citation = data citation metrics
https://datasetsearch.research.google.com/
Good data management is the gateway to data sharing
Borghi J, Abrams S, Lowenberg D, Simms S, Chodacki J (2018) Support Your Data: A Research Data
Management Guide for Researchers. Research Ideas and Outcomes 4: e26439.
https://doi.org/10.3897/rio.4.e26439
Changing the culture around data management and sharing
• Me
○ Answer to the underpowered study
○ Data sharing and good data management are closely aligned
○ Compliance with mandates
○ Credit for the totality of my work
• Science and Society
○ Transparency
○ Reproducibility
○ Reduced waste
○ Driving discovery
• Future me
○ One most likely to benefit from good data management and sharing through stable archives
○ No one ever regretted annotating too much
• My colleagues (and PI)
○ Easy to engage with colleagues over well annotated data and associated code
○ What happens when the postdoc leaves?
April 2021; National Academies of Science Workshop
Resource sharing plan → Data Management and Sharing Plan
Required Elements of the NIH Plan
Data Type
Related Tools, Software and/or Code
Standards
Data Preservation, Access, and Associated Timelines
Access, Distribution, or Reuse Considerations
Oversight of Data Management and Sharing
How compliance with the Plan will be monitored and managed,
frequency of oversight, and by whom (e.g., titles, roles).
Adapted from Ghosh et al. 2022
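The six required elements listed above can be treated as a simple checklist while drafting. A minimal sketch, assuming a draft plan is held as a Python dict keyed by element name; the dict layout and the `missing_elements` helper are illustrative assumptions, not an NIH or dkNET tool:

```python
from typing import Dict, List

# Element names as listed on the slide.
REQUIRED_ELEMENTS = [
    "Data Type",
    "Related Tools, Software and/or Code",
    "Standards",
    "Data Preservation, Access, and Associated Timelines",
    "Access, Distribution, or Reuse Considerations",
    "Oversight of Data Management and Sharing",
]


def missing_elements(plan: Dict[str, str]) -> List[str]:
    """Return the required elements that are absent or left blank in a draft."""
    return [name for name in REQUIRED_ELEMENTS if not (plan.get(name) or "").strip()]


if __name__ == "__main__":
    # Hypothetical, incomplete draft.
    draft = {
        "Data Type": "Imaging data with acquisition metadata",
        "Standards": "   ",  # present but blank, so still reported as missing
    }
    for name in missing_elements(draft):
        print("Missing:", name)
```

A check like this only confirms that each section has some text; whether the text is adequate is still a matter for the plan's reviewers.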
Data Type
• Type and amount/size
• Modality
• Level of aggregation
• Degree of data processing
• Which data and why
• A brief listing of the metadata
• Other relevant data, and
• Associated documentation to
facilitate interpretation of the
scientific data.
Example of some metadata needed
• Acquisition: Instrument, Protocol
• Biosample: Tissue, Cell, Organoid
• Participant/Donor
• Assay
• Analytics
[Figure panels: voltage traces, spike times, fluorescence traces; optophysiology, MRI, microscopy]
Adapted from Ghosh et al. 2022
Standards
• Data formats
○ CSV/TSV, PNG, TIFF
○ NWB, NIfTI, OME-Zarr, SWC
• Data dictionaries
• Data identifiers
○ sub-01, ses-02
• Definitions
○ Ontologies
• Unique identifiers
○ UUID, RRID
• Other data documentation
○ Quality assurance, control
xkcd: 927
Adapted from Ghosh et al. 2022
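Consistent identifiers are easiest to keep consistent when they are generated rather than typed by hand. A minimal sketch, assuming zero-padded subject and session labels in the style of the examples above and a random UUID per record; the function names and the two-digit padding are assumptions made for illustration, and this is not a full implementation of any particular standard:

```python
import uuid


def subject_label(index: int, width: int = 2) -> str:
    """Zero-padded subject label, e.g. 1 -> 'sub-01'."""
    return f"sub-{index:0{width}d}"


def session_label(index: int, width: int = 2) -> str:
    """Zero-padded session label, e.g. 2 -> 'ses-02'."""
    return f"ses-{index:0{width}d}"


def file_stem(subject: int, session: int) -> str:
    """Combine both labels into a file name stem, e.g. 'sub-01_ses-02'."""
    return f"{subject_label(subject)}_{session_label(session)}"


def new_uuid() -> str:
    """Random (version 4) UUID for a record that needs a globally unique ID."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    print(file_stem(1, 2))
    print(new_uuid())
```

Human-readable labels such as these identify items within a dataset, while UUIDs (and RRIDs for research resources) stay unique across datasets.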
What standards should I use?
● Repositories often enforce specific
standards for metadata and data
● Thinking about where your data will
end up before you start your
experiments will help you determine
how to collect, annotate and
organize your (meta)data
● FAIRsharing.org maintains a database
of standards and policies across
biomedicine
Related Tools, Software and/or Code
• Any specialized tools needed
to access or manipulate
shared scientific data
• How needed tools can be
accessed
• Whether such tools are
likely to remain available for
as long as the scientific data
remain available.
Adapted from Ghosh et al. 2022
This should include the following information, if applicable:
● Which statistical package or program was used to
manipulate the data, along with the version of the
software that was used and any packages, scripts, or
settings that were used or developed during the course of
the study, as well as how users can access the software
● Whether there were any custom workflows or pipelines
developed as part of the study necessary to analyze or
process the data, and how users can access them
● Whether there were any executable programs or macros
written as part of the study necessary to analyze or
process the data, as well as how users can access the code
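Recording software versions, as the first point above asks, can be automated at analysis time. A minimal sketch using only the Python standard library; the `software_manifest` name, the output layout, and the package list are assumptions for this example:

```python
import json
import platform
from importlib import metadata
from typing import Dict, Iterable, Optional


def software_manifest(packages: Iterable[str]) -> Dict[str, object]:
    """Collect the interpreter version, platform, and installed package versions."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; substitute the packages the analysis imports.
    manifest = software_manifest(["numpy", "pandas"])
    print(json.dumps(manifest, indent=2))
```

Saving the resulting JSON next to the shared data gives later users the version information needed to rebuild a comparable environment.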
Data Preservation, Access, and Associated Timelines
• Name of the repository(ies)
• How data will be findable and
identifiable
• Timeline
Adapted from Ghosh et al. 2022
Lesson: Think at the beginning about where your
data will end up
Best practice: Submit your data to a repository specialized for your type of data or your
domain
If there isn’t one, general-purpose repositories are also available
Where Can I Deposit My Data?
• List of DK relevant
repositories,
recommended by NLM
and various journals
• Created in conjunction
with NIDDK
• Coming soon: FAIR data
wizard
● FAIR Standards
● Clinical Repositories
Information
● Data maintenance
● Data size limit and cost
● Dynamic database
https://dknet.org/about/Suggested-data-repositories-niddk
Access, Distribution, or Reuse Considerations
• Informed consent
• Privacy and confidentiality
protections
• Whether access to scientific data
derived from humans will be
controlled
• Any restrictions imposed by
federal, Tribal, or state laws,
regulations, or policies, or existing
or anticipated agreements
• Any other considerations that may
limit the extent of data sharing.
Adapted from Ghosh et al. 2022
Data Management and Sharing Plan
● Creating a good data management
and sharing plan allows you to:
○ Comply with NIH mandates
○ Ensure that you allocate
enough resources for
preparing and sharing your
data
○ Ensure that you collect your
data in a FAIR manner
○ Easily share data with yourself,
future you, your colleagues
and the scientific community
● dkNET provides links to resources
that can help https://dknet.org/rin/rigor-reproducibility-about
DMP Tool questions (dmptool.org)
• How do you plan to provide access to your data?
• When will you make the data available?
• Which archive/repository/central database have you identified as a
place to deposit data?
• Will a data-sharing agreement be required?
• What metadata/documentation will be submitted alongside the data?
• What file formats will you use for your data, and why?
• What transformations will be necessary to prepare data for
preservation/data sharing?
• Do you need funding for the implementation of this data sharing plan?
• Research outputs, by type
• Local data management plan (this is not part of the DMP tool)
Adapted from Ghosh et al. 2022
Data Management and Sharing Plan
It benefits your scientific work!
• Helps plan ahead
○ Start alongside proposal, well before data collection
○ Human and technological resources
○ Cost: $$, effort
• Cost
○ Training/re-training
○ Time
○ Resources
• Benefit
○ Reduction in $$ and effort, fewer surprises
○ Organizing data and metadata reduces transition effort
○ Open data opens new avenues and may reduce collection cost
○ Using standards lowers development cost
Adapted from Ghosh et al. 2022
Having trouble? Ask dkNET
Coming soon: The FAIR data wizard!
dkNET Collection of Information
https://nexus.od.nih.gov/all/2023/01/05/3-ways-to-prepare-for-the-2023-nih-grants-conference/
Get involved in the dkNET Community
dkNET Homepage: dkNET.org
Check out dkNET Resour
Join Webinar
Follow us
@dknet_info
Check Out or Post
News and Funding
Opportunities
Blog, Calendar
Sign up email list

dkNET Office Hours - "Are You Ready for 2023? New NIH Data Management and Sharing Mandates" 01/13/2023
