2016 Ocean Sciences Meeting tutorial
4. Scope
Imagine a project:
• that includes a well-thought-out and
documented data management plan,
• and robust implementation of that
plan throughout the project and
beyond.
• This talk is not for that project; it is for
the rest of us.
5. So why do we care about data
management?
• Internal reasons: do good research,
write papers, get tenure, win more
grants.
• External reasons: public access &
reproducibility
Risk of becoming dark data (Heidorn,
2008)
6. Why care about external access?
• Intangibles for an Investigator
• Maybe someday I’ll benefit from someone else’s
data
• Maybe I’ll learn something through informal dialogue
• Most science funding is from public resources and
should/could be considered a public trust resource
• Peer pressure
• Tangibles for an Investigator
• Increased efficiency
• My funders require it.
7. So why do we care about data
management?
• Internal reasons: do good research,
write papers, get tenure, win more
grants.
• External reasons: greater impact
10. What is the DMRC & do we really
need another Data Plan Project?
• Probably not
• The DMRC is not a Data Plan tool
• Unidata community requested help
with implementation
• Therefore, the DMRC is primarily a
curated list of tools for implementation
12. What the DMRC Offers
• Highlights requirements from funding
agencies;
• Points to Best Practices developed by
others in the Data Management
space;
• Sorts available tools by best practice;
• Details available tools.
13. Requirements
• Highlight data management funding
requirements from NASA, NOAA,
NSF
• These are the agencies that fund our
community, so we try to stay up to
date, but remember that the
agency-posted information is always
the authority
23. What We Are Exploring
• Dataverse by Harvard
• Designed for sharing, archiving, and
citing data
• Allows you to create a DOI
• Allows you to store and make data
accessible in perpetuity
24. What We Are Exploring
Known Dataverse Characteristics:
• Largest single file limited to 10GB
• No limit to number of files
• Users create their own Dataverse
• Designate private or public
• Open to data from all science disciplines
• Does not corrupt at least some software
files (e.g. IDV bundles)
• FREE
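A minimal sketch of how the per-file limit above could be checked before a deposit. It assumes the 10GB figure quoted on the slide, read here as 10 * 1024**3 bytes; the function name `oversized_files` and the constant are hypothetical, and nothing here calls Dataverse itself. Check the current Dataverse documentation for the actual limit that applies to your installation.

```python
from pathlib import Path

# Assumption: the 10GB per-file limit quoted on the slide,
# interpreted as 10 * 1024**3 bytes. Verify against current docs.
MAX_FILE_BYTES = 10 * 1024**3


def oversized_files(root, limit=MAX_FILE_BYTES):
    """Return (path, size_in_bytes) pairs for files under root that exceed limit."""
    too_big = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                too_big.append((path, size))
    return too_big


if __name__ == "__main__":
    # Hypothetical usage: scan the current directory and list files to split or compress.
    for path, size in oversized_files("."):
        print(f"{path}: {size / 1024**3:.2f} GiB exceeds the assumed limit")
```

Files reported by a check like this would need to be split, compressed, or hosted elsewhere before upload.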
25. What We Are Exploring
Possible Dataverse Contributions:
• Description (providing DOIs)
• Sharing (access in perpetuity)
• Preservation (static copy in perpetuity)
• Cost (free): very suitable for projects that
might otherwise become long-tail data
28. We Welcome Your Resource
Suggestions!
• Please visit:
http://goo.gl/forms/Ngp4Xu9nGr
29. Example Workflow Implementation
• Radar and Lidar data from the
University of Wyoming King Air
• Millersville University Plains Elevated
Convection at Night (PECAN) data
• North Carolina State University WRF
North Atlantic Model Outputs
30. Part of a larger effort: Agile Data
Curation
• Means taking implementable steps to
improve data management for
external access.
• Philosophically, it attempts to apply
lessons from agile software
development to data management.
31. Agile Curation Principles,
2nd Generation
(J. Young, K. Benedict, & C. Lenhardt, AGU 2015 Fall Meeting)
1) Delivery, access, use and citation of
research data are the primary measures of
success.
2) Maximize the impact of research data
through the continuous integration of
curation activities
3) Support unanticipated needs for and uses
of research data (and documentation) and
develop flexible systems to capture new
uses.
32. Agile Curation Principles,
2nd Generation
4) Make data open and accessible as early in the
process as possible.
5) Encourage crowd-sourced / community
feedback to improve and enhance the data.
Provide basic metadata for data available early
in the process even if the data are not finalized.
6) Identify key individuals in a research project
that have the requisite motivation, knowledge,
or ability to learn and get out of their way.
33. Agile Curation Principles,
2nd Generation continued
7) Data creators and data curators should work
closely throughout the data life story to ensure
the most efficient and streamlined process.
8) Identify the most effective method(s) for
maintaining close communication between the
data creators and curators involved and use
them.
9) Target the steady delivery of incremental
improvements to research data discovery,
access and use that is consistent with a
sustainable level of effort and available funding.
34. Agile Curation Principles,
2nd Generation continued
10) Start with the basics and only make systems
more complex as needed, while maintaining a
low bar to entry.
11) Continuous attention to technical excellence
and good design enhances agility.
12) Continuously develop a community of data
providers, curators and users that participate in
the evolution of the research data systems.
38. Unidata is one of the University Corporation
for Atmospheric Research (UCAR)'s
Community Programs (UCP), and is
funded primarily by the National Science
Foundation (Grant NSF-1344155).
Editor's Notes
This talk and effort are inspired by the desire to move projects currently at risk of becoming dark data to at least become long-tail data. However, the concepts described may be useful to projects currently in the long tail or even the big head of the spectrum.
We need to recognize that there are at least two motivations for data management: internal reasons and external reasons. As researchers, we focus on our internal research needs, but from a societal perspective the potentially greater value comes from external access.
Agile curation is not focused on assisting you with the workflow for your internal goals (though there may be benefits there too). Instead, the focus is on helping researchers meet external data management challenges.
Internal workflows tend to be optimized, at least according to the preferences of the individual researcher.
From the perspective of most researchers, public or external access is at best a secondary purpose, and these workflows are not optimized in the same way. These photos are analogous examples. A sign may be put out notifying the public that something is freely available, but its quality statement may be questionable (the sign says "good free stuff" but it sits on upholstered furniture in the snow); the sign may offer no quality descriptor; or there may be no sign at all, with free access relying on awareness of social conventions. Does this sound like our current public access approach?
Principles of agile curation
Balancing “Scientific Progress” and “Cyberinfrastructure Development” is, on the face of it, a significant challenge, and meeting it requires acknowledging that many of the cyberinfrastructure solutions are scientifically driven. Regardless, the portfolio of initiatives that EarthCube has supported reflects both fundamental investments in cyberinfrastructure development and outreach and scientific advancement within the geosciences.