This document provides an introduction to research data management. It defines key concepts like research, research data, and the research data lifecycle. It discusses the importance of data sharing and outlines benefits such as enabling new research, reducing duplication, and providing credit to researchers. The document notes that most research data disappears over time unless properly managed. It also explains that research data can be complex with multiple researchers, data types, formats and standards involved. Metadata is described as important data about data. The challenges of preserving complex and transformed data through archiving are also covered.
Research Data Management for Researchers: Module 1: Intro to Data, Metadata and the Research Data Lifecycle
1. Research Data Management for Researchers:
Module 1: Intro to Data, Metadata and the Research Data Lifecycle
—DRAFT—
Glen Newton (Glen.Newton@gmail.com)
Paul Budkewitsch (Paul.Budkewitsch@nrcan-rncan.gc.ca)
1 / 79
2. Outline
Some definitions
Data Sharing
Research & Research Data Lifecycle
Research Data Complexity
Data Archiving
Data Management Roles
Next Modules
3. Some definitions
What is:
- Research?
- Research Data?
- Research & Research Data Life Cycles?
4. Data Sharing
5. Data is becoming more important
In the past, more emphasis was given to publications.
This is changing.
6. Diepenbroek, M., Schindler, U., Grobe, H. 2008.
PANGAEA - An ICSU World Data Center as a Networked Publication and Library System for Geoscientific Data
http://hdl.handle.net/10013/epic.28613
7. Research Data Disappears
- The status quo is for most research data to (eventually) disappear: except for large well organized projects, historically most research data collected has already disappeared.
- Not through malice, just through mismanagement or more accurately a lack of management
8. Degradation in information content associated with data and metadata over time
[Figure: information content of data and metadata (vertical axis) declining over time (horizontal axis) under the status quo, starting at the time of publication. Annotations along the curve:]
- Specific details about problems with individual items or specific dates of collection are lost relatively rapidly
- General details about the data collection are lost through time
- Retirement or career change makes access by scientists to “mental storage” difficult or unlikely
- Accident may destroy data and documentation
- Death of investigator and subsequent loss of remaining records
de la Sablonnière, Auger, Sabourin and Newton. 2010. Facilitating Data Sharing in the Behavioral Sciences. Submitted to Data Science Journal.
After Michener et al. 1997, Nongeospatial Metadata for the Ecological Sciences. Ecological Applications 7:1:330-342
10.1890/1051-0761(1997)007[0330:NMFTES]2.0.CO;2
9. Why Share data?
- encourages scientific enquiry and debate
- enables scrutiny of research outcomes
- facilitates research beyond the scope of the original research
- leads to new collaborations between data users and data creators
- reduces the cost of duplicating data collection
- provides important resources for education and training
- encourages the improvement and validation of research methods
- promotes the research that created the data and its outcomes
- can provide a direct credit to the researcher as a research output in its own right
10. Benefits of Data Sharing
“Within this new technological context, more widespread and efficient access to and sharing of research data will have substantial benefits for public scientific research. Open access to, and sharing of, data reinforces open scientific inquiry, encourages diversity of analysis and opinion, promotes new research, makes possible the testing of new or alternative hypotheses and methods of analysis, supports studies on data collection methods and measurement, facilitates the education of new researchers, enables the exploration of topics not envisioned by the initial investigators, and permits the creation of new data sets when data from multiple sources are combined. Sharing and open access to publicly funded research data not only helps to maximize the research potential of new digital technologies and networks, but provides greater returns from the public investment in research.”
OECD. 2003. Promoting Access to Public Research Data for Scientific, Economic, and Social Development: OECD Follow Up Group on Issues of Access to Publicly Funded Research Data.
http://dataaccess.ucsd.edu/Final Report 2003.pdf
11. Unpredicted re-use of data
- Data often has value beyond that planned or even imagined by the collector of the data
- And combining it with other data can often support the discovery of emergent processes
12. Unpredicted re-use of data
What is the following?
13.
14. Page from a ship’s log
New Zealand, October 1769
15.
16.
17. These are Captain James Cook’s logs
“His Majestys Bark [a type of ship] Endeavour on Her Passage On the Coast of New Zealand from Poverty Bay to Southw October 15th 1769; Course: S 20° E; Winds: Vary; Location: 39° 50′ 180° 51′; Moderate and fair weather...thunder and spitting rain...”
Log 39, page 79. UK National Archives
- Record of date, time, location (lat/long), the sea conditions and local weather conditions
- Now being mined by JISC, the University of Sunderland, the Met Office Hadley Centre and the British Atmospheric Data Centre for climate change research (http://www.nationalarchives.gov.uk/news/stories/371.htm)
18.
19. Research & Research Data Lifecycle
20. Research & Research Data Lifecycle
Various perspectives
21. Pepe, A. & Mayernik, M. & Borgman, C. & Van de Sompel, H.
Technology to Represent Scientific Practice: Data, Life Cycles, and Value Chains
http://arxiv.org/abs/0906.2549
22. Lord, P., A. Macdonald, L. Lyon & D. Giarretta. 2004.
From Data Deluge to Data Curation. In Proceedings of the UK e-science
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/150.pdf
23. Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
24. Humphrey, C. 2006. e-Science and the Life Cycle of Research
http://datalib.library.ualberta.ca/~humphrey/lifecycle-science060308.doc
25. Interagency Working Group on Digital Data. 2009. Harnessing the Power of Digital Data for Science and Society
http://www.whitehouse.gov/files/documents/ostp/opengov inbox/harnessing power web.pdf
26. Very complete view:
Bechhofer, S. & D. Roure & M. Gamble & C. Goble & I. Buchan. 2010.
Research Objects: Towards Exchange and Reuse of Digital Knowledge.
Nature Precedings.
http://dx.doi.org/10.1038/npre.2010.4626.1
27. Research Data Complexity
28. Research Data Complexity
• Data
• Metadata
• Transformations (derived data/metadata), combinations
• More Metadata
29. Research Data Complexity
Real research projects can have extremely complex data collection and management needs.
30. Wallis, J. 2008. Moving Archival Practices Upstream: An Exploration of the Life Cycle
of Ecological Sensing Data in Collaborative Field Research.
The International Journal of Digital Curation 1:3
http://www.ijdc.net/index.php/ijdc/article/view/67
31. Research Data Complexity
Data collection and management complexity is affected by a number of factors. For example:
• multiple researchers
• multiple teams
• inter- and multi-disciplinary
• multiple jurisdictions
• multiple funding agencies
• multiple types of funding agencies (government, industry, philanthropic)
• long term projects
• human subjects
• multiple/competing vocabularies/ontologies/metadata standards
• data formats, data transformation, data quality control
32. What is Metadata?
“Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information.”
NISO. 2004. Understanding Metadata.
http://www.ijdc.net/index.php/ijdc/article/view/67
33. What is Metadata?
Three main types of metadata:
• “Descriptive metadata describes a resource for purposes such as discovery and identification. It can include elements such as title, abstract, author, and keywords.”
• “Structural metadata indicates how compound objects are put together, for example, how pages are ordered to form chapters.”
• “Administrative metadata provides information to help manage a resource, such as when and how it was created, file type and other technical information, and who can access it.”
NISO. 2004. Understanding Metadata.
http://www.ijdc.net/index.php/ijdc/article/view/67
34. What is Metadata?
Administrative metadata is usually divided into two types:
• “Rights management metadata, which deals with intellectual property rights.”
• “Preservation metadata, which contains information needed to archive and preserve a resource.”
NISO. 2004. Understanding Metadata.
http://www.ijdc.net/index.php/ijdc/article/view/67
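To make the NISO categories concrete, here is a minimal sketch of one metadata record grouped by type. The dataset, the field names and all values are hypothetical and chosen only for illustration. Real projects would normally follow an established metadata standard rather than an ad hoc layout like this one.

```python
import json

# Hypothetical record for an imaginary dataset. Field names are illustrative
# assumptions, not taken from NISO or from any particular metadata standard.
record = {
    # Descriptive: supports discovery and identification.
    "descriptive": {
        "title": "Example stream temperature survey",
        "abstract": "Hourly water temperature readings from three sites.",
        "author": "A. Researcher",
        "keywords": ["temperature", "stream", "survey"],
    },
    # Structural: how the parts of a compound object fit together.
    "structural": {
        "files_in_order": ["readme.txt", "site_a.csv", "site_b.csv"],
    },
    # Administrative: information that helps manage the resource.
    "administrative": {
        "created": "2010-01-15",
        "file_type": "text/csv",
        "access": "public",
        # The two administrative subtypes named on the slide.
        "rights_management": {"license": "CC BY-NC-SA 2.5 Canada"},
        "preservation": {"checksum_algorithm": "sha256", "storage_medium": "disk"},
    },
}


def metadata_types(rec):
    """Return the top-level metadata types present in a record, sorted."""
    return sorted(rec)


print(metadata_types(record))
print(json.dumps(record["administrative"], indent=2))
```

Keeping the three types in separate groups makes it easier to hand each part to the system that needs it, for example descriptive fields to a catalogue and preservation fields to an archive.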
35. Research Data Complexity
Real research projects often have data that is described by many metadata standards.
37. Research Data Complexity
As data is transformed, translated, filtered, or combined with other data in a research data workflow, lineage or provenance metadata can capture the nature of these changes.
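As a rough illustration of lineage metadata, the sketch below runs two transformations and appends one lineage entry per step. The helper names (`fingerprint`, `apply_step`), the entry fields and the sample readings are all assumptions made for this example. Workflow systems such as those surveyed in the cited literature capture far richer provenance.

```python
import hashlib
import json


def fingerprint(rows):
    """Return a SHA-256 digest of the rows in a stable JSON encoding."""
    encoded = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def apply_step(rows, lineage, name, func):
    """Run one transformation and append a lineage entry describing it."""
    result = func(rows)
    lineage.append({
        "step": name,
        "input_sha256": fingerprint(rows),
        "output_sha256": fingerprint(result),
        "rows_in": len(rows),
        "rows_out": len(result),
    })
    return result


# Hypothetical sample data: temperatures in Fahrenheit, one value missing.
readings = [
    {"site": "A", "temp_f": 50.0},
    {"site": "B", "temp_f": None},
    {"site": "C", "temp_f": 68.0},
]

lineage = []

# Step 1: filter out rows without a measurement.
filtered = apply_step(
    readings, lineage, "drop missing values",
    lambda rows: [r for r in rows if r["temp_f"] is not None],
)

# Step 2: translate the unit from Fahrenheit to Celsius.
converted = apply_step(
    filtered, lineage, "convert F to C",
    lambda rows: [
        {"site": r["site"], "temp_c": round((r["temp_f"] - 32) * 5 / 9, 1)}
        for r in rows
    ],
)

print(json.dumps(lineage, indent=2))
```

Because each entry stores the digest of its input and output, the output digest of one step matches the input digest of the next, so the chain of changes can be checked later.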
38. Bose, R. & Frew, J. 2005. Lineage Retrieval for Scientific Data Processing: A Survey.
ACM Computing Surveys 37:1
http://dx.doi.org/10.1145/1057977.1057978
39. Research Data Complexity
Some of these workflows can be very complex.
40. Davidson, S. & Freire, J. 2008. Provenance and scientific workflows: challenges and opportunities.
SIGMOD ’08: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
http://dx.doi.org/10.1145/1376616.1376772
41. Freire, J. & Koop, D. & Santos, E. & Silva, C.T. 2008. Provenance for Computational Tasks: A Survey.
Computing in Science & Engineering
http://dx.doi.org/10.1109/MCSE.2008.79
42. Barga, R. & Digiampietri, L. 2008. Automatic capture and efficient storage of e-Science experiment provenance.
Concurrency and Computation: Practice and Experience 20:5:419-429
http://dx.doi.org/10.1002/cpe.1235
43. Bowers, S. & McPhillips, T. & Ludäscher, B. 2008. Provenance in collection-oriented scientific workflows.
Concurrency and Computation: Practice and Experience 20:5:519-529
http://dx.doi.org/10.1002/cpe.1235
44. Research Data Complexity
Some transformations can cause metadata to become data!
45. Jones, M. and Schildhauer, P. and Reichman, O. and Bowers, Shawn. 2006.
The New Bioinformatics: Integrating Ecological Data from the Gene to the Biosphere.
Annual Review of Ecology, Evolution, and Systematics 37:1:519-544.
http://dx.doi.org/10.1146/annurev.ecolsys.37.091305.110031
46. Data Archiving
47. Data Archiving
• Medium
• Migration
48. Medium
• The physical storage medium, whether used for analog or digital storage of information, has an expected lifespan.
• Digital media can deteriorate and alter the underlying data (bits) of files well before their expected end of life.
49. Miller, S. 2002. Bridging the Gap between Libraries and Data Archives: Progress Report.
Presentation at Joint Information Systems Committee (JISC, UK) and NSF Digital Libraries Initiative All Projects Meeting, Edinburgh, Scotland.
http://gdc.ucsd.edu:8080/digarch/about-project/presentations/edinburgh2002/view
50. Medium
Any single project can have a number of initial physical media.
51. Diepenbroek, M., Schindler, U., Grobe, H. 2008.
PANGAEA - An ICSU World Data Center as a Networked Publication and Library System for Geoscientific Data
http://hdl.handle.net/10013/epic.28613
52. Migration
• Before the end-of-life of a medium, its contents need to be copied reliably (bits verified) to a new medium (of the same kind or a different one).
• The provenance metadata needs to be updated when this occurs.
• Sometimes reading the old medium is difficult or impossible, because the technology has progressed and appropriate working readers (e.g. 9-track tape readers) are no longer available.
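The first two points can be sketched as a copy that is verified by checksum and then recorded in a provenance log. This is a minimal sketch under stated assumptions: the `migrate` function, the JSON-lines log format, the choice of SHA-256 and the sample file are all hypothetical, and a real archive would use its own fixity and preservation tooling.

```python
import hashlib
import json
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source, target, provenance_log):
    """Copy source to target, verify the bits, then append a provenance entry."""
    source, target = Path(source), Path(target)
    before = sha256_of(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    after = sha256_of(target)
    if before != after:
        # Do not keep an unverified copy on the new medium.
        target.unlink()
        raise OSError(f"checksum mismatch while migrating {source}")
    entry = {
        "event": "migration",
        "source": str(source),
        "target": str(target),
        "sha256": after,
        "verified_at": datetime.now(timezone.utc).isoformat(),
    }
    with open(provenance_log, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


# Demonstration with two temporary directories standing in for the old and new media.
with tempfile.TemporaryDirectory() as workdir:
    old_copy = Path(workdir) / "old_medium" / "data.csv"
    old_copy.parent.mkdir()
    old_copy.write_bytes(b"id,value\n1,3.5\n")
    log_path = Path(workdir) / "provenance.jsonl"
    entry = migrate(old_copy, Path(workdir) / "new_medium" / "data.csv", log_path)
    logged_entries = log_path.read_text(encoding="utf-8").splitlines()

print(entry["event"], entry["sha256"])
```

Writing the log entry only after the checksums match means the provenance record never claims a migration that was not verified.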
53. Data Management Roles
54. Data Management Roles
Understanding roles in the research data workflow is helpful in successfully managing data.
55. Data Management Roles
One view:
56. Pryor, G. & Donnelly, M. 2009. Skilling Up to Do Data: Whose Role, Whose Responsibility, Whose Career?
International Journal of Digital Curation 4:2
http://www.ijdc.net/index.php/ijdc/article/view/126
57. Roles & Responsibilities: Another view
Liz Lyon, Director, United Kingdom Office for Library and Information Networking (UKOLN)
• Scientist: creation and use of data
• Institution: curation of and access to data
• Data centre: curation of and access to data
• User: use of 3rd party data
• Funder: set/react to public policy drivers
• Publisher: maintain integrity of the scientific record
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
58. Roles & Responsibilities: Scientist
• Rights:
  - Of first use.
  - To be acknowledged.
  - To expect IPR to be honoured.
  - To receive data training and advice.
• Responsibilities:
  - Manage data for life of project.
  - Meet standards for good practice.
  - Comply with funder / institutional data policies and respect IPR of others.
  - Work up data for use by others.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
59. Roles & Responsibilities: Scientist (cont.)
• Relationships:
  - With institution as employee.
  - With subject community.
  - With data centre.
  - With funder of work.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
60. Roles & Responsibilities: Institution
• Rights:
  - To be offered a copy of data.
• Responsibilities:
  - Set internal data management policy.
  - Manage data in the short term.
  - Meet standards for good practice.
  - Provide training and advice to support scientists.
  - Promote the repository service.
• Relationships:
  - With scientist as employer.
  - With data centre through expert staff.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
61. Roles & Responsibilities: Data Centre
• Rights:
  - To be offered a copy of data.
  - To select data of long-term value.
• Responsibilities:
  - Manage data for the long-term.
  - Meet standards for good practice.
  - Provide training for deposit.
  - Promote the repository service.
  - Protect rights of data contributors.
  - Provide tools for re-use of data.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
62. Roles & Responsibilities: Data Centre (cont.)
• Relationships:
  - With scientist as client.
  - With user communities.
  - With institution through expert staff.
  - With funder of service.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
63. Roles & Responsibilities: User
• Rights:
  - To re-use data (non-exclusive licence).
  - To access quality metadata to inform usability.
• Responsibilities:
  - Abide by licence conditions.
  - Acknowledge data creators / curators.
  - Manage derived data effectively.
• Relationships:
  - With data centre as supplier.
  - With institution as supplier.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
64. Roles & Responsibilities: Funder (1/2)
• Rights:
  - To implement data policies.
  - To require those they fund to meet policy obligations.
• Responsibilities:
  - Consider wider public-policy perspective & stakeholder needs.
  - Participate in strategy co-ordination.
  - Develop policies with stakeholders.
  - Participate in policy co-ordination, joint planning & fund service delivery.
  - Monitor and enforce data policies.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
65. Roles & Responsibilities: Funder (2/2)
• Responsibilities (cont.):
  - Resource post-project long-term data management.
  - Act as advocate for data curation & fund expert advisory service(s).
  - Support workforce capacity development of data curators.
• Relationships:
  - With scientist as funder.
  - With institution.
  - With data centre as funder.
  - With other funders.
  - With other stakeholders as policy-maker and funder of services.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
66. Roles & Responsibilities: Publisher
• Rights:
  - To expect data are available to support publication.
  - To request pre-publication data deposit in long-term repository.
• Responsibilities:
  - Engage stakeholders in development of publication standards.
  - Link to data to support publication standards.
  - Monitor & enforce publication standards.
• Relationships:
  - With scientist as creator, author and reader.
  - With data centres and institutions as suppliers.
Directly from: Lyon, L. 2007. Dealing with Data: Roles, Rights, Responsibilities and Relationships
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/reports/dealing with data report-final.pdf
67. Research Lifecycle
Where is the researcher in the lifecycle?
68. Interagency Working Group on Digital Data. 2009. Harnessing the Power of Digital Data for Science and Society
http://www.whitehouse.gov/files/documents/ostp/opengov inbox/harnessing power web.pdf
69. Research Lifecycle
The Research Lifecycle needs to evolve to support Data Management...
70. Lord, P. & A. Macdonald. 2003.
e-Science Curation Report: Data curation for e-Science in the UK: an audit to establish requirements for future curation and provision
http://www.jisc.ac.uk/uploaded documents/e-ScienceReportFinal.pdf
71. Lord, P. & A. Macdonald. 2003.
e-Science Curation Report: Data curation for e-Science in the UK: an audit to establish requirements for future curation and provision
http://www.jisc.ac.uk/uploaded documents/e-ScienceReportFinal.pdf
72. Lord, P. & A. Macdonald. 2003.
e-Science Curation Report: Data curation for e-Science in the UK: an audit to establish requirements for future curation and provision
http://www.jisc.ac.uk/uploaded documents/e-ScienceReportFinal.pdf
73. Diepenbroek, M., Schindler, U., Grobe, H. 2008.
PANGAEA - An ICSU World Data Center as a Networked Publication and Library System for Geoscientific Data
http://hdl.handle.net/10013/epic.28613
74. Research Data Commons
The future suggested by the Australian National Data Service is perhaps the most complete vision, as it includes the greater societal context.
75. ANDS Technical Working Group. 2007. Towards the Australian Data Commons: A proposal for an Australian National Data Service
http://pfc.org.au/pub/Main/Data/TowardstheAustralianDataCommons.pdf
76. Next Modules
77. Next Modules
Not covered (brief introduction to...):
• Module 2: More hands on
  - Barriers to sharing
  - In-depth data and metadata examples
  - Research Data Plan
  - Data formats
• Module 3: Canadian and International Research Data Activities and Organizations
  - CNC/CODATA
  - National Consultation Access to Scientific Research Data (2004)
  - Research Data Canada
  - International
78. Acknowledgments
CNC/CODATA Committee; Margaret Haines, Carleton University; NRC/CISTI; Research Data Canada; Larry Speers.
79. Contact and license
• Contact: Glen Newton glen.newton@gmail.com
• License: Creative Commons Attribution-Noncommercial-Share Alike 2.5 Canada License; Paternité-Pas d’Utilisation Commerciale-Partage des Conditions Initiales à l’Identique 2.5 Canada
• Copyright: © 2009-2010 National Research Council, Natural Resources Canada, Glen Newton
• Note: Various components copyright their respective owners