Slides belonging to a basic course on research data management. The course consists of 4 parts:
Part 1: what and why
1.1 data management plans
Part 2: protecting and organizing your data
2.1 data safety and data security
2.2 file naming, organizing data (TIER documentation protocol)
Part 3: sharing your data
3.1 via collaboration platforms (during research)
3.2 via data archives (after your research)
Part 4: caring for your data, or making data usable
4.1 tidy data
4.2 documentation/metadata
4.3 licenses
4.4 open data formats
Good (enough) research data management practices (Leon Osinski)
Slides of a lecture on research data management (RDM), given to third-year students (Eindhoven University of Technology, major Psychology & Technology) as part of the course 0HV90 Quantitative Research. A handy summary, 'Research data management basics in a nutshell', is added at the end of the slides.
Research Data Management and the Research Data Lifecycle: a Gentle Introduction (Glen Newton)
This document provides a gentle introduction to research data management and the research data lifecycle. It defines key terms like research, research data, and research data lifecycles. It discusses the benefits of data sharing, including enabling new research and testing new hypotheses. Research data can be complex with data, metadata, transformations, and combinations. The document outlines the research data lifecycle from collection to archiving and roles in data management.
Data Management in the context of Open Science.
Because open access has become mandatory for publications and project-funded research data, it is the responsibility of each researcher to be informed about, and then trained in, the new practices.
This document provides an overview of research data management and outlines the steps for creating a data management plan. It discusses why research data management is important, including enabling data reuse and sharing and meeting funder requirements. The document then walks through creating a data management plan, covering topics like the types and formats of data that will be generated, ethical and intellectual property issues, how data will be stored and backed up, and long-term preservation and deposition of data. It emphasizes that planning early helps ensure accurate, complete and secure data, and avoids problems down the line.
What funders want you to do with your data (Leon Osinski)
Funders want researchers to 1) deposit the relevant data from their research in an approved repository to make it FAIR (Findable, Accessible, Interoperable, Reusable), 2) make the data openly available whenever possible, and 3) write a Data Management Plan describing how they will manage their data during and after the project. Funders require depositing data in repositories to enable reuse, making data open access "as open as possible, as closed as necessary", and having a Data Management Plan that addresses reuse according to FAIR principles.
Research data management at TU Eindhoven (Leon Osinski)
The document discusses research data management at TU Eindhoven. It outlines the long process of developing RDM practices since 2008. It describes the current organization and governance structure for RDM. Key external requirements for RDM from funders, regulations, and integrity standards are also summarized. The document concludes by outlining RDM support services available and the benefits of good RDM practices.
Research Data Management for Researchers: Module 1: Intro to Data, Metadata a... (Glen Newton)
This document provides an introduction to research data management. It defines key concepts like research, research data, and the research data lifecycle. It discusses the importance of data sharing and outlines benefits such as enabling new research, reducing duplication, and providing credit to researchers. The document notes that most research data disappears over time unless properly managed. It also explains that research data can be complex with multiple researchers, data types, formats and standards involved. Metadata is described as important data about data. The challenges of preserving complex and transformed data through archiving are also covered.
A basic course on Research data management, part 1: what and why (Leon Osinski)
A basic course on research data management for PhD students. The course consists of 4 parts. The course was given at Eindhoven University of Technology (TUe), 24-01-2017
The document provides information about MANTRA, a free online course for research data management created by the University of Edinburgh. MANTRA teaches best practices for managing research data through open educational modules aligned with the research data lifecycle. It is available for reuse and repurposing under an open license. The course covers topics like data planning, organization, documentation, storage, security, and sharing.
Managing data throughout the research lifecycle (Marieke Guy)
This document summarizes a presentation about managing data throughout the research lifecycle. It discusses the stages of the research lifecycle, including planning, data creation, documentation, storage, sharing, and preservation. It provides examples of research lifecycle models and addresses key questions to consider at each stage, such as what formats to use, how to document data, where to store it, and how to share and preserve it. The presentation emphasizes making informed decisions about data management and talking to colleagues for support and advice.
Enriching Scholarship 2014 Beyond the Journal Article: Publishing and Citing ... (Natsuko Nicholls)
The document discusses data sharing policies and mandates from various organizations including federal funding agencies in the US and internationally, journals, and a paradigm shift toward more transparent and collaborative research that integrates publications and data. Key points include requirements for data management plans from NIH and NSF, expectations of funding agencies in other countries to maximize access to research data, a journal policy requiring data to be made available, and challenges around measuring the impact of shared data given the lack of common practices and standards for citing data.
A basic course on Research data management, part 4: caring for your data, or ... (Leon Osinski)
A basic course on research data management for PhD students. The course consists of 4 parts. The course was given at Eindhoven University of Technology (TUe), 24-01-2017
This slideshow was used in an Introduction to Research Data Management course for the Social Sciences Division, University of Oxford, on 2015-05-27. It provides an overview of some key issues, looking at both day-to-day data management, and longer term issues, including sharing, and curation.
This presentation was delivered at the Elsevier Library Connect Seminar on 6 October 2014 in Johannesburg, 7 October 2014 in Durban, and 9 October 2014 in Cape Town. It gives an overview of the potential role that librarians can play in research data management.
The document provides an overview of the Donders Repository, which aims to securely store original research data, document the research process, and make data accessible to researchers and the public. It describes the procedural design, including the different roles, collection types, and states. The technical architecture is based on iRODS software and scalable storage. The repository fits into researchers' workflows and supports the timeline of projects from initiation to data sharing. Standards like BIDS help make neuroimaging data FAIR (Findable, Accessible, Interoperable, Reusable).
This document provides an introduction to data management. It discusses why data management is important, covering key aspects like developing data management plans, file organization, documentation and metadata, storage and backup, legal and ethical considerations, sharing and reuse, and preservation. Effective data management is critical for research success as it supports reproducibility, sharing, and preventing data loss. The document outlines best practices and resources like the library that can help with developing strong data management strategies.
University of Bath Research Data Management training for researchers (Jez Cope)
Slides from a workshop on Research Data Management for research staff and students at the University of Bath.
Part of the Research360 project (http://blogs.bath.ac.uk/research360).
Authors: Cathy Pink and Jez Cope, University of Bath
Using Open Science to advance science - advancing open data (Robert Oostenveld)
This document discusses using open science practices like open data to advance science. It notes the benefits of open data like improved reproducibility and opportunities for data mining. However, sharing neuroimaging and other human subject data presents challenges regarding data size, sensitivity, and privacy regulations. The document promotes using the Brain Imaging Data Structure (BIDS) format to organize data in an open, standardized way. It also discusses the gradient between personal/identifiable data that requires protection and de-identified research data that can be shared, as well as legal constraints and appropriate repositories for sharing data responsibly.
This slideshow was used at a lunchtime session delivered at the Humanities Division, University of Oxford, on 2014-05-12. It provides a general overview of some key data management topics, plus some pointers on where to find further information.
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ... (Amanda Whitmire)
A workshop as part of the International Digital Curation Conference 2016 on DMP development and support. This presentation demonstrates how we can use data management plans as a source of information to better understand researcher data stewardship practices and how to support them. Be sure to see the slide notes to better understand the presentation (most slides are just photos/icons).
DataONE Education Module 03: Data Management Planning (DataONE)
Lesson 3 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license, attribution and citation requested.
This document provides an introduction to research data management for geoscience PhD students. It defines research data and different data types. It discusses the importance of managing data throughout its lifecycle for efficient and valid research. It outlines funder requirements, university policies, and activities involved in good research data management like data planning, documentation, storage, sharing and preservation.
Data Literacy: Creating and Managing Research Data (cunera)
This document discusses best practices for creating and managing research data. It covers defining data, the importance of data management, developing a data management plan, file naming conventions, metadata, data sharing and preservation. Key points include making a data management plan addressing types of data, standards, access and sharing policies; using descriptive file names with dates; storing multiple versions of data; and including metadata to explain the data. Resources for data management support are provided.
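The file naming advice summarized above (descriptive names that include a date, with separate versions kept) could be sketched as follows. This is only an illustration: the `build_filename` helper, the order of the fields, and the `v01` version style are assumptions made here, not a convention taken from the slides.

```python
import re
from datetime import date


def build_filename(project, description, day, version, ext):
    """Build a descriptive file name with an ISO 8601 date and a zero-padded version.

    Hypothetical convention: <project>_<description>_<YYYY-MM-DD>_v<NN>.<ext>
    """
    def slug(text):
        # Lower-case the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{slug(project)}_{slug(description)}_{day.isoformat()}_v{version:02d}.{ext}"


name = build_filename("Sleep Study", "raw survey data", date(2017, 1, 24), 1, "csv")
print(name)  # sleep-study_raw-survey-data_2017-01-24_v01.csv
```

Putting the date in year-month-day order means that an alphabetical listing of files with the same project and description is also a chronological one.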
DataONE Education Module 01: Why Data Management? (DataONE)
Lesson 1 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license, attribution and citation requested.
Introduction to research data management; Lecture 01 for GRAD521 (Amanda Whitmire)
Lesson 1: Introduction to research data management. From a series of lectures from a 10-week, 2-credit graduate-level course in research data management (GRAD521, offered at Oregon State University).
The course description is: "Careful examination of all aspects of research data management best practices. Designed to prepare students to exceed funder mandates for performance in data planning, documentation, preservation and sharing in an increasingly complex digital research environment. Open to students of all disciplines."
Major course content includes: Overview of research data management, definitions and best practices; Types, formats and stages of research data; Metadata (data documentation); Data storage, backup and security; Legal and ethical considerations of research data; Data sharing and reuse; Archiving and preservation.
See also, "Whitmire, Amanda (2014): GRAD 521 Research Data Management Lectures. figshare. http://dx.doi.org/10.6084/m9.figshare.1003835. Retrieved 23:25, Jan 07, 2015 (GMT)"
The Brain Imaging Data Structure and its use for fNIRS (Robert Oostenveld)
These slides were prepared for the NIRS toolkit course at the Donders, which has been postponed due to the Corona crisis. The slides present BIDS, explain how fNIRS often involves multiple signals, and relate the two to synchronization and data management.
This document discusses subprograms and parameter passing in programming languages. It covers fundamental concepts of subprograms like definitions, calls, headers, and parameters. It then describes different parameter passing methods like pass-by-value, pass-by-reference, and pass-by-name. It also discusses how major languages like C, C++, Java, Ada, C#, and PHP implement parameter passing and type checking.
This document discusses phenomenology and its key concepts. It begins by explaining that phenomenology, which originated with Edmund Husserl, aims to understand phenomena based on how they are experienced rather than external constructs or models. It emphasizes bracketing presuppositions and suspending judgment to see phenomena with an open mind. The document then outlines Husserl's concepts of epoche and eidetic reduction used in phenomenological study. It notes areas of application including social sciences, health sciences, psychology, nursing and education. It concludes by summarizing steps in phenomenological research including identifying a phenomenon, describing experiences of it, and distilling the essence of the shared experience.
Implementation details and performance traits of generics in .NET, Java and C++. Presentation for the Jerusalem .NET/C++ User Group by Sasha Goldshtein.
The document discusses inheritance in C# through an example program. The Child class inherits from the Parent class and overrides the print() method. When an instance of the Child class is created, it first calls the Parent class constructor through the base keyword, then the Child constructor is called. When print() is called on the Child instance, it first calls the base print() method from the Parent class, then calls the Child's print() method.
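The call order described above can be reproduced in a few lines. The sketch below is in Python rather than C#, so it is an analogue and not the example program from the slides: the `show` method stands in for `print()`, the `calls` list is a hypothetical way of recording the order, and `super()` plays the role of C#'s `base` keyword.

```python
calls = []  # records the order in which constructors and methods run


class Parent:
    def __init__(self):
        calls.append("Parent constructor")

    def show(self):
        calls.append("Parent show")


class Child(Parent):
    def __init__(self):
        # Unlike C#, Python does not run the parent constructor implicitly;
        # super().__init__() is the explicit counterpart of ": base()".
        super().__init__()
        calls.append("Child constructor")

    def show(self):
        # Run the inherited implementation first, then the child's own behaviour.
        super().show()
        calls.append("Child show")


child = Child()
child.show()
print(calls)
# ['Parent constructor', 'Child constructor', 'Parent show', 'Child show']
```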
This document discusses subprograms in programming languages. It covers the fundamentals of subprograms including definitions, parameters, and parameter passing methods. Key points include:
- A subprogram has a single entry point and control returns to the caller when execution terminates. Parameters can be passed by value, reference, result, or name.
- Issues around subprograms include parameter type checking, local variable scope, and parameter passing semantics and implementation. Languages support different parameter passing methods like pass-by-value or pass-by-reference.
- Parameter passing methods have tradeoffs between efficiency and aliasing. Multidimensional arrays as parameters require type information to be passed correctly in some languages. Subprograms can also
This document discusses collections in C#. It defines collections as enumerable data structures that can be accessed using indexes or keys. It then covers the different types of collections in C#, including stacks, queues, lists, and dictionaries. Examples are provided for queues, arrays, arraylists, and dictionaries to illustrate how they can be used.
PL/SQL is a programming language that combines the SQL operations of querying and manipulating data in an Oracle database with the procedural language constructs of variables, conditions, and loops. PL/SQL can be used for database-side programming as well as client-side application development. It provides advantages like better performance, portability, higher productivity, and integration with Oracle. PL/SQL supports various data types, control structures, exception handling, and object-oriented programming features. Cursors allow processing of multiple rows returned from a SQL statement and can be static, dynamic, or reference types. Procedures and functions are reusable program units that allow passing parameters and returning values.
Research Data Management: Part 1, Principles & Responsibilities (AmyLN)
This two-part course is a collaboration between CU Libraries/Information Services and the Office of Research Compliance & Training. The purpose of this course is to familiarize you with the various aspects of research data management (RDM)
Part 1: Why RDM is both recommended and required
What research data are
Who is responsible for RDM
Part 2:
When RDM activities occur
How you can carry out RDM activities
This document discusses exception handling in Java. It defines exceptions as objects that describe errors during code execution. The try, catch, and finally keywords are used to handle exceptions. Exceptions can be generated by the Java runtime system or manually coded. The try block contains code that could cause exceptions. Catch blocks handle specific exception types. Finally blocks contain cleanup code. All exceptions extend the Throwable class. The Exception class is for program exceptions, while Error is for environmental errors. Uncaught exceptions use the default exception handler.
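The same try/catch/finally flow can be sketched outside Java. The example below uses Python, where `except` corresponds to Java's `catch`; the `parse_age` function and the `log` list are hypothetical names introduced only for illustration. It shows one exception raised by the runtime, one raised manually, and a cleanup step that runs in every case.

```python
def parse_age(text, log):
    """Return the age as an int, or None if the text is not a valid age."""
    try:
        age = int(text)  # the runtime raises ValueError for non-numeric text
        if age < 0:
            # An exception can also be raised manually by the program.
            raise ValueError("age must not be negative")
        return age
    except ValueError as error:
        # Handles only this exception type; other types would propagate.
        log.append(f"handled {type(error).__name__}")
        return None
    finally:
        # Cleanup code runs whether or not an exception occurred.
        log.append("cleanup")


log = []
print(parse_age("42", log))   # 42
print(parse_age("abc", log))  # None
print(parse_age("-5", log))   # None
print(log)
# ['cleanup', 'handled ValueError', 'cleanup', 'handled ValueError', 'cleanup']
```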
Logic programming deals with relations rather than functions. It separates logic from control by having the programmer declare facts and relations that are true, while the system determines how to use those facts to solve problems. Horn clauses are used to specify relations, with the consequent stating what is true if the conjunction of antecedents is true. Queries in Prolog can ask whether a specific tuple belongs to a relation or whether there exists a value for a variable such that a clause is true.
A basic course on Research data management, part 2: protecting and organizing ... (Leon Osinski)
A basic course on research data management for PhD students. The course consists of 4 parts. The course was given at Eindhoven University of Technology (TUe), 24-01-2017
The document discusses generics in C#, explaining that generics allow defining type-safe data structures without committing to actual data types, improving performance and code quality. It covers why generics are required by discussing issues with non-generic stacks, and describes generic type parameters, constraints, methods, delegates and event handling using generics.
Compiler Components and their Generators - Lexical Analysis (Guido Wachsmuth)
The document discusses lexical analysis in compiler construction, including an overview of the topics covered such as regular languages represented as regular grammars, regular expressions, and finite state automata. It also discusses the equivalence between these formalisms and techniques for constructing tools for lexical analysis.
The document discusses C# delegates and events. It defines a delegate as a class that encapsulates a method signature and can be used to pass methods as parameters. Delegates allow methods to be assigned and invoked dynamically. Events are a special type of delegate used to define callbacks that are invoked when an event occurs. The document provides examples of singlecast and multicast delegates, declaring and using delegates, and creating a custom delegate and event.
Building Surveys in Qualtrics for Efficient AnalyticsShalin Hai-Jew
Qualtrics® is a state-of-the-art online research suite which enables sophisticated data collection and analytics. This presentation will describe how to build a survey for efficient analytics, both within Qualtrics® and outside Qualtrics®. This presentation emphasizes the importance of thinking through the data collection, the analytics, and the data presentation, in order to build a survey instrument that works for the research context. Along the way, some of the cutting-edge survey-building capabilities of Qualtrics® (including rich question types, invisible questions, branching logic, display logic, panel triggers, and others), will be showcased along with the data analytics functionalities (including cross-tab analysis and data visualizations).
This document provides an overview and instructions for using Oracle's SQL Fundamentals II course. It discusses copyright and usage restrictions, outlines the course objectives and prerequisites. It also lists additional resources for SQL and describes the HR schema used in examples.
A basic course on Research data management: part 1 - part 4
1. A basic course on Research data management
part 1: what and why
PROOF course Information Literacy and
Research Data Management
TU/e, 19-09-2017
l.osinski@tue.nl, TU/e IEC/Library
Available under CC BY-SA license, which permits copying
and redistributing the material in any medium or format &
adapting the material for any purpose, provided the original
author and source are credited & you distribute the
adapted material under the same license as the original
2. Research data management [RDM]
what #1
Essence of RDM: “… tracking back to what you did 7 years ago and
recovering it (...) immediately in a re-usable manner.” (Henry Rzepa)
3. Research data management [RDM]
what #2
RDM: caring for your data with the purpose to:
1. protect their mere existence: data loss, data authenticity (RDM basics)
2. share them with others
a. for reasons of reuse: in the same context or in a different context; during
research and after research
b. for reasons of reproducibility checks: scientific integrity, data quality
RDM = good data practices [1-6] that make your data understandable, easy
to work with, and available to other scientists
1. Dynamic ecology (2016), Ten commandments for good data management. https://dynamicecology.wordpress.com/2016/08/22/ten-commandments-for-good-data-management/
2. Borer, E.T., Seabloom, E.W., Jones, M.B., et al. (2009) Some simple guidelines for effective data management, Bulletin of the Ecological Society of America,
90(2), p. 205-214. doi: 10.1890/0012-9623-90.2.205
3. Hook, L.A., Santhana Vannan, S.K., Beaty, T.W. et al. Best practices for preparing environmental data sets to share and archive. Available online
http://daac.ornl.gov/PI/BestPractices-2010.pdf . doi: 10.3334/ORNLDAAC/BestPractices-2010
4. White, E.P., Baldridge, E., Brym, T. et al. (2013) Nine simple ways to make it easier to (re)use your data, Ideas in Ecology and Evolution, 6(2), p. 1-10. doi:
10.4033/iee.2013.6b.6.f
5. Goodman, A., Pepe, A., Blocker, A.W., et al. (2014) Ten simple rules for the care and feeding of scientific data, PLOS Computational Biology, 10(4),
e1003542. doi: 10.1371/journal.pcbi.1003542
6. Sandve, G.K., et al. (2013), Ten simple rules for reproducible computational research, PLOS Computational Biology, 9(10), e1003285. doi:
10.1371/journal.pcbi.1003285
4. Source: Research Data
Netherlands / Marina
Noordegraaf
Outline
1. Research data management [RDM]: what and why
a. data management plan
b. discussion
2. Sharing your data, or making your data findable and accessible
a. data protection: back up, file naming, organizing data
b. data sharing: via collaboration platforms, data archives
3. Caring for your data, or making your data usable and interoperable
a. tidy data
b. metadata/documentation
c. licenses
d. open data formats
5. Because you work together with other researchers: collaborative science
Because of re-using results: data-driven science, open science
Because of scientific integrity: validating data analysis by reproducibility checks
requires data and the code that is used to clean, process and analyze the data and
to produce the final outputs
Additional reasons
Because your data are unique / not easily repeatable
(long term observational data)
Because you benefit from it: increases your visibility and
enhances the trustworthiness / credibility of your
research
Why share research data? #1
6. Data sharing is increasingly required by:
+ Journals [e.g. American Economic Review, PLoS, Nature]
+ Professional organizations [VSNU, KNAW]
+ Universities, including TU/e
+ Research funders [NWO, ZonMw, EC]: data management plan
Why share research data? #2
because you have to…
7. EC: Horizon 2020 #1
Open research data (ORD) pilot: why?
“The ORD pilot aims to improve and maximise access to and re-use of
research data generated by Horizon 2020…”
“The ORD pilot applies primarily to the data needed to validate the results
presented in scientific publications. Other data can also be provided…”
“A data management plan (DMP) is required for all projects participating in
the extended ORD pilot…”
“Participating in the ORD pilot does not necessarily mean opening up all your
research data. Rather, the ORD Pilot follows the principle “as open as possible,
as closed as necessary” and focuses on encouraging sound data management
as an essential part of research best practice.” (my underlining)
8. EC: Horizon 2020 #2
how? sound research data management
Sound research data management is data management following
the FAIR principles. All research data should be:
Findable: easy to find by both humans and computer systems;
Accessible: stored for long term with well-defined license and access
conditions (open access when possible);
Interoperable: ready to be combined with other datasets by humans as well as
computer systems;
Reusable: ready to be used for future research and to be processed further
using computational methods.
9. Source: Research Data Netherlands /
Marina Noordegraaf
EC: Horizon 2020 #3
requirements
The conditions set by Horizon 2020 with regard to research data
management, come down to two requirements:
1. Formulate a data management plan, and;
2. Deposit research data in a data repository
10. The DMP is a set of questions along the FAIR principles about:
1. What research data sets the project will collect, process and/or generate
2. The handling of these data sets during and after the project
3. Whether and how data sets will be findable/discoverable, re-useable and
shared/made open access
4. How data will be curated and preserved
5. What measures are taken to safeguard and protect (sensitive) data
EC Horizon 2020 #4
data management plan
DMP template Horizon 2020 (via DMPOnline): recommended but voluntary
ZonMw template (via DMP online)
DMP template by 4TU.Centre of Research Data
Examples of H2020 DMPs: http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples
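While drafting a DMP, the five topics listed on this slide can be tracked as a simple checklist. The Python sketch below is only an illustration under stated assumptions: it is not the Horizon 2020 or DMPonline template format, and the section keys, the `unanswered` helper and the sample answers are hypothetical names.

```python
# Hypothetical checklist; the keys paraphrase the five DMP topics on this slide.
DMP_SECTIONS = {
    "data_sets": "What research data sets will the project collect, process and/or generate?",
    "handling": "How will these data sets be handled during and after the project?",
    "fair_sharing": "Will the data sets be findable, re-usable and shared/made open access, and how?",
    "preservation": "How will the data be curated and preserved?",
    "protection": "What measures safeguard and protect (sensitive) data?",
}


def unanswered(answers: dict[str, str]) -> list[str]:
    """Return the section keys that do not yet have a non-empty answer."""
    return [key for key in DMP_SECTIONS if not answers.get(key, "").strip()]


# Sample draft (made-up content): one section answered, one left blank.
draft = {
    "data_sets": "Survey responses (CSV) and analysis scripts.",
    "handling": "",
}

for key in unanswered(draft):
    print(f"TODO {key}: {DMP_SECTIONS[key]}")
```

Keeping the questions and the answers in one structure makes it easy to see which parts of the plan still need attention before the DMP is submitted.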
11. Research data management
discussion topics and questions
Storage and back-up
What sort of data do you use? Are you creating new data or are you working with pre-existing
data?
Where do you store your research data? Is there a back-up? Where?
Are data selections made? Not everything is to be stored but…?
Metadata and documentation (information to let you find, use and understand the data)
Do you describe your research data? Who measured or collected what, when, how? Other
context information?
Are you content with the way you document or describe your research data? Do you succeed
in finding the right (version of your) research data?
Can other researchers understand and (re-)use your research data (during and after
research)? Should they be able to?
Access and re-use
Who can access your research data?
What will happen to your research data when you leave TU/e?
Would you consider publishing your research data, i.e. making them publicly available?
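One possible way to act on the storage and documentation questions above is to keep a small metadata file next to each data file, recording who measured or collected what, when and how, together with a checksum that lets you confirm a back-up copy is unchanged. The sketch below assumes a JSON sidecar file; the `.meta.json` suffix, the field names and the helper functions are hypothetical choices, not an established standard.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sidecar(data_file: Path, who: str, what: str, when: str, how: str) -> Path:
    """Write a JSON sidecar (hypothetical layout) describing the data file."""
    record = {
        "file": data_file.name,
        "who": who,
        "what": what,
        "when": when,
        "how": how,
        "sha256": sha256_of(data_file),
    }
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


def verify(data_file: Path, sidecar: Path) -> bool:
    """Check that the data file still matches the checksum in its sidecar."""
    record = json.loads(sidecar.read_text(encoding="utf-8"))
    return record["sha256"] == sha256_of(data_file)


# Demonstration in a temporary directory with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "measurements.csv"
    data.write_text("subject,value\n1,0.42\n", encoding="utf-8")
    meta = write_sidecar(
        data,
        who="A. Researcher",
        what="Reaction times, pilot study",
        when="2017-09-19",
        how="Keyboard responses, lab PC",
    )
    print(verify(data, meta))
```

Running `verify` against a back-up copy and its sidecar is a quick check that the copy has not been altered or corrupted.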
12. Research data management
which of these statements is true?
Storage and back-up
1. My research data are stored safely and securely, including regular back-ups
Metadata and documentation
2. I keep metadata with my data: who measured/collected what, when, how
Access and re-use
3. My colleagues are able to access and use my data
4. Other researchers are able to access and use my data
5. My nearest colleagues and I are the only ones who can understand my
data
6. Anyone should be able to use my data when I have finished with it
13. Reasons not to share your data
Preparing my data for sharing takes time and effort
But research data management also increases your research efficiency
My data are confidential
But you can anonymize or pseudonymize your data
My data still need to yield publications
But you can publish your data under an embargo and by publishing your data you
establish priority and you can get credits for it
My data can be misused or misinterpreted
But the best defense against malicious use is to refer to an archival copy of your
data which is guaranteed to be exactly as you mean it to be
My data are only interesting for me
But sharing your data may be required by a funder /
journal or your data may be requested to validate your
results
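As an illustration of the pseudonymization option mentioned above, direct identifiers can be replaced by stable codes before a dataset is shared. The Python sketch below uses a keyed hash (HMAC-SHA256) for this; the `pseudonymize` function, the column names and the sample rows are hypothetical, and the secret key is assumed to be stored separately from the shared data. Replacing direct identifiers does not by itself guarantee that participants cannot be re-identified from the remaining variables.

```python
import hashlib
import hmac


def pseudonymize(participant_id: str, secret_key: bytes) -> str:
    """Return a stable 16-character pseudonym derived with HMAC-SHA256."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Made-up example data; "participant" is the direct identifier to replace.
rows = [
    {"participant": "alice@example.org", "score": 7},
    {"participant": "bob@example.org", "score": 5},
    {"participant": "alice@example.org", "score": 9},
]

# Assumption: the key is kept outside the dataset and never shared with it.
key = b"replace-with-a-secret-kept-outside-the-dataset"

shared = [
    {"participant": pseudonymize(row["participant"], key), "score": row["score"]}
    for row in rows
]

for row in shared:
    print(row)
```

The same participant always receives the same code, so repeated measurements can still be linked, while the original identifier cannot be read from the shared file.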
14. 1. Website IEC/Library [TU/e]: https://www.tue.nl/en/university/library/
2. Figshare support, The importance of data management for research: https://youtu.be/Ae205CNrk6w
3. Henry Rzepa, Collaborative FAIR data sharing: http://www.ch.imperial.ac.uk/rzepa/blog/?p=16292
4. Dynamic ecology (2016), ten commandments for good data management.
https://dynamicecology.wordpress.com/2016/08/22/ten-commandments-for-good-data-management/
5. Borer, E.T., Seabloom, E.W., Jones, M.B., et al. (2009) Some simple guidelines for effective data
management, Bulletin of the Ecological Society of America, 90(2), p. 205-214. doi: 10.1890/0012-9623-90.2.205
6. Hook, L.A., Santhana Vannan, S.K., Beaty, T.W. et al. Best practices for preparing environmental data sets
to share and archive. doi: 10.3334/ORNLDAAC/BestPractices-2010
7. White, E.P., Baldridge, E., Brym, T. et al. (2013) Nine simple ways to make it easier to (re)use your data,
Ideas in Ecology and Evolution, 6(2), p. 1-10. doi: 10.4033/iee.2013.6b.6.f
8. Goodman, A., Pepe, A., Blocker, A.W., et al. (2014) Ten simple rules for the care and feeding of scientific
data, PLOS Computational Biology, 10(4), e1003542. doi: 10.1371/journal.pcbi.1003542
9. Sandve, G.K., et al. (2013), Ten simple rules for reproducible computational research, PLOS Computational
Biology, 9(10), e1003285. doi: 10.1371/journal.pcbi.1003285
10. Data sharing increases visibility: http://dx.doi.org/10.7717/peerj.175
11. Data sharing enhances trustworthiness: http://dx.doi.org/10.1371/journal.pone.0026828
URL’s of mentioned webpages
in order of appearance #1
15. 12. Data availability policy journals: http://www.nap.edu/openbook.php?record_id=10613&page=33
13. Data availability policy American Economic Review: https://www.aeaweb.org/aer/data.php
15. Data availability policy PLoS: http://journals.plos.org/plosone/s/data-availability
16. Data availability policy Nature: http://www.nature.com/authors/policies/availability.html
17. VSNU Code of Scientific Conduct (Dutch, revision 2014):
http://www.vsnu.nl/files/documenten/Domeinen/Onderzoek/Code_wetenschapsbeoefening_2004_(2014)
.pdf
18. KNAW responsible research data management: https://www.knaw.nl/en/news/publications/responsible-
research-data-management-and-the-prevention-of-scientific-misconduct?set_language=en
19. Radboud University research data policy: http://www.ru.nl/research-information-services/institutional-
policy/policy-research-data-management/
20. TU/e Code of Scientific Conduct: http://www.tue.nl/en/university/about-the-university/integrity/scientific-
integrity/
21. NWO and research data: http://www.nwo.nl/en/policies/open+science/data+management
21. ZonMW Toegang tot data: https://www.zonmw.nl/en/research-and-results/access-to-data/
22. Horizon 2020 Guidelines on data management:
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-
mgt_en.pdf
URL’s of mentioned webpages
in order of appearance #2
16. 23. About FAIR: Mons, B. et al., Cloudy, increasingly FAIR: revisiting the FAIR Data guiding principles for the
European Open Science Cloud: http://dx.doi.org/10.3233/ISU-170824
24. Template data management plan Horizon 2020: https://dmponline.dcc.ac.uk/
25. ZonMW data management plan template: https://www.zonmw.nl/en/research-and-results/access-to-
data/format-data-management-plan/
26. Data management plan template (4TU.ResearchData): http://researchdata.4tu.nl/en/planning-
research/data-management-plan/
27. Examples of Horizon 2020 data management plans: http://www.dcc.ac.uk/resources/data-management-
plans/guidance-examples
28. Emilio M. Bruna (04-09-2014), The opportunity cost of my #OpenScience was 36 hours + $690 (UPDATED) .
http://brunalab.org/blog/2014/09/04/the-opportunity-cost-of-my-openscience-was-35-hours-690/
28. Rouder, Jeffrey N., The what, why, and how of born-open data, Behavior Research Methods, vol. 48(2016),
p. 1062-1069. http://dx.doi.org/10.3758/s13428-015-0630-z (see p. 1063: “It was a pain to document the
data; it was a pain to format the data”)
URL’s of mentioned webpages
in order of appearance #3
17. A basic course on Research data management
part 2: protecting and organizing
your data
PROOF course Information Literacy and
Research Data Management
TU/e, 07-03-2017
l.osinski@tue.nl, TU/e IEC/Library
Available under CC BY-SA license, which permits copying
and redistributing the material in any medium or format &
adapting the material for any purpose, provided the original
author and source are credited & you distribute the
adapted material under the same license as the original
18. Research data management
Sharing your data, or making your data findable and accessible
with good data practices
→ protecting your data: back up, access control; file naming, organizing
data, versioning
+ sharing your data via collaboration platforms and archives
Caring for your data, or making your data usable and
interoperable with good data practices
+ tidy data
+ metadata/documentation
+ licenses
+ open data formats
Research data management
what was it again
19. Be safe
+ storage, backup (data safety, protecting against loss): use local
ICT infrastructure (departmental servers, including SURFdrive) as
much as possible
+ access control (data security, protecting against unauthorized
use): with DataverseNL for example
Be organized, or: you (and others) should be able to tell what’s in
a file without opening it
+ file-naming, organizing data in folders, versioning
Protecting your data
good data practices during your research
“…we can copy everything and do not manage it well.” (Indra Sihar)
20. File-naming #1
be consistent and aim for concise but informative names
How you organize and name your files has a big impact on your
ability to find those files later and to understand what they contain.
Good file names are consistent (use file-naming conventions), unique
(distinguishes a file from files with similar subjects as well as different
versions of the file) and meaningful (use descriptive names).
File-naming conventions help you find your data, help others to find
your data and help track which version of a file is most current
Avoid using special characters in a file name: / : * ? < > | [ ] & $
Use hyphens or underscores instead of periods or spaces to
separate logical elements in a file name
Avoid very long names: usually 25 characters is sufficient length
Names should include all necessary descriptive information:
initials researcher, project number, procedure/method…
Names should be independent of where the file is stored (do not use
the same names in different folders)
Include dates (format YYYYMMDD) and a version number on files
Add a readme.txt to each folder in which the file naming and its
meaning is explained
Source: Best practices for file naming (Stanford University Libraries)
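The conventions above can be sketched as a small helper that assembles a file name from its descriptive elements. This is only an illustration: the function names, the order of the elements and the example values (initials, project number) are assumptions, not part of the Stanford guidelines.

```python
import re
from datetime import date

# Characters the slide advises against, plus spaces and periods,
# which should not be used inside an element of the name.
FORBIDDEN = re.compile(r"[/:*?<>|\[\]&$\s.]+")


def clean(element: str) -> str:
    """Replace forbidden characters, spaces and periods by a hyphen."""
    return FORBIDDEN.sub("-", element.strip()).strip("-")


def build_file_name(initials: str, project: str, description: str,
                    day: date, version: int, extension: str) -> str:
    """Join the elements with underscores: date (YYYYMMDD) first, version last."""
    parts = [
        day.strftime("%Y%m%d"),
        clean(initials),
        clean(project),
        clean(description),
        f"v{version:02d}",
    ]
    return "_".join(parts) + "." + extension.lstrip(".")


# Hypothetical example values.
name = build_file_name("LO", "P123", "interview transcript",
                       date(2017, 3, 7), 2, "docx")
print(name)
```

Whatever order of elements you choose, describe it in the readme.txt of the folder so that others can decode the names.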
21. File naming #2
think about the ordering of elements within a filename
Order by date:
2013-04-12_interview-recording_THD.mp3
2013-04-12_interview-transcript_THD.docx
2012-12-15_interview-recording_MBD.mp3
2012-12-15_interview-transcript_MBD.docx
Order by subject:
MBD_interview-recording_2012-12-15.mp3
MBD_interview-transcript_2012-12-15.docx
THD_interview-recording_2013-04-12.mp3
THD_interview-transcript_2013-04-12.docx
Order by type:
Interview-recording_MBD_2012-12-15.mp3
Interview-recording_THD_2013-04-12.mp3
Interview-transcript_MBD_2012-12-15.docx
Interview-transcript_THD_2013-04-12.docx
Forced order with numbering:
01_THD_interview-recording_2013-04-12.mp3
02_THD_interview-transcript_2013-04-12.docx
03_MBD_interview-recording_2012-12-15.mp3
04_MBD_interview-transcript_2012-12-15.docx
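Why the ordering of elements matters can be shown with a plain alphabetical sort, which is how most file browsers list files. The sketch below reuses the file names from this slide; the helper `reorder` is a hypothetical name introduced for illustration.

```python
names = [
    "2013-04-12_interview-recording_THD.mp3",
    "2012-12-15_interview-transcript_MBD.docx",
    "2013-04-12_interview-transcript_THD.docx",
    "2012-12-15_interview-recording_MBD.mp3",
]


def reorder(name: str, order: tuple) -> str:
    """Rearrange the underscore-separated elements of a file name."""
    stem, extension = name.rsplit(".", 1)
    elements = stem.split("_")
    return "_".join(elements[i] for i in order) + "." + extension


# Date first: an alphabetical sort is also a chronological sort.
by_date = sorted(names)

# Subject first: files of the same interviewee end up next to each other.
by_subject = sorted(reorder(n, (2, 1, 0)) for n in names)

for n in by_date + by_subject:
    print(n)
```

The element you put first decides how your files group together, so pick the one you search by most often.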
22. File organization
Beatriz Ramirez, Data management plan for the PhD project:
development and application of a monitoring system to assess the
impacts of climate and land cover changes on eco-hydrological
processes in an eastern Andes catchment area
Source: Haselager, dr. G.J.T.
(Radboud University Nijmegen);
Aken, prof. dr. M.A.G. van (Utrecht
University) (2000): Personality and
Family Relationships. DANS.
http://dx.doi.org/10.17026/dans-
xk5-y7vc .
23. Organizing your data in folders #1
based on the TIER documentation protocol (http://www.projecttier.org/)
Guiding principles of TIER documentation protocol
1. keep your raw or original data raw
+ save your raw data read-only in its original format in a separate folder
+ make a working copy of your raw data (input data, used for
processing and analysis)
2. keep the command files (files containing code written in the syntax of the
(statistical) software you use for the study) apart from the data
3. keep the analysis files (the fully cleaned and processed data files that you
use to generate the results reported in your paper) in a separate folder
4. store the metadata (codebook, description of variables, etc.) in a separate
folder, apart from the data itself
24. Organizing your data in folders #2
based on the TIER documentation protocol (http://www.projecttier.org/)
1. Main project folder (name of your research project/working title of your
paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files
1.2.2. Command files
1.2.3. Analysis files
1.3. Documents
1.4. Literature
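The folder hierarchy above can be created in one go with a short script. This is a minimal sketch: the folder names are lower-cased, hyphenated versions of the names on the slide, and the project name is hypothetical. The example builds the skeleton in a temporary directory so that nothing is written to your own files.

```python
import tempfile
from pathlib import Path

# Folder layout following the outline on the slide (names are an assumption).
TIER_FOLDERS = [
    "original-data-and-metadata/original-data",
    "original-data-and-metadata/metadata/supplements",
    "processing-and-analysis-files/importable-data-files",
    "processing-and-analysis-files/command-files",
    "processing-and-analysis-files/analysis-files",
    "documents",
    "literature",
]


def create_project_skeleton(root: Path) -> list:
    """Create the folder hierarchy under root and return the created paths."""
    created = []
    for relative in TIER_FOLDERS:
        folder = root / relative
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my-research-project"  # hypothetical project name
    folders = create_project_skeleton(project)
    n_created = len(folders)
    all_exist = all(folder.is_dir() for folder in folders)
    for folder in folders:
        print(folder.relative_to(project))
```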
25. 1. Main project folder (name of your research project/working title of your
paper)
1.1. Original data and metadata
1.1.1. Original data (raw data, obtained/gathered data)
Any data that were necessary for any part of the processing and/or
analysis you reported in your paper.
Copies of all your original data files, saved in exactly the format they were in
when you first obtained them. The name of the original data file may be
changed
Keep these data read only!
1.1.2. Metadata
1.1.2.1. Supplements
Organizing your data in folders #3
based on the TIER documentation protocol
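One way to keep original data read-only is to copy each file into the original data folder and then remove the write permission. The sketch below is an assumption about how this could be done, not a step prescribed by the TIER protocol; the function names and the example file are hypothetical. A checksum confirms that the stored copy is identical to the file you obtained.

```python
import hashlib
import shutil
import stat
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum of a file, used to verify that a copy is identical."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def store_original(source: Path, original_dir: Path) -> Path:
    """Copy a raw data file into the original data folder and make it read-only."""
    original_dir.mkdir(parents=True, exist_ok=True)
    target = original_dir / source.name
    shutil.copy2(source, target)
    # Read permission only, for owner, group and others.
    target.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "survey-export.csv"  # hypothetical raw data file
    source.write_text("patient_id,drug_a,drug_b\n1,67,56\n")
    stored = store_original(source, Path(tmp) / "original-data")
    identical = sha256_of(source) == sha256_of(stored)
    writable = bool(stored.stat().st_mode & stat.S_IWUSR)
    print("identical copy:", identical, "| still writable:", writable)
```

All further work is then done on a working copy, never on the stored original.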
26. 1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
The Metadata Guide: document that provides information about each of your
original data files. Applies especially to obtained data files
A bibliographic citation of the original data files, including the date you
downloaded or obtained the original data files and unique identifiers that
have been assigned to the original data files.
Information about how to obtain a copy of the original data file
Whatever additional information is needed to understand and use the data in the
original data file
1.1.2.1. Supplements
Additional information about an original data file that’s not written by
yourself but that is found in existing supplementary documents, such as
users’ guides and code books that accompany the original data file
Organizing your data in folders #4
based on the TIER documentation protocol
27. Organizing your data in folders #5
based on the TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files (the data you work with, input data, suitable for
processing and analysis)
A corresponding version for each of the original data files. This version can be identical
to the original version, or in some cases it will be a modified version.
For example modifications required to allow your software to read the file (converting
the file to another format, removing unusable data or explanatory notes from a table)
The original and importable versions of a data file should be given different names
The importable data file should be as nearly identical as possible to the original
The changes you make to your original data files to create the corresponding
importable data files should be described in a Readme file
1.2.2. Command files
1.2.3. Analysis files
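Creating an importable version and recording the changes for the Readme file can go hand in hand. The sketch below assumes a hypothetical original file with two explanatory note lines and semicolon separators; the function name and the wording of the change log are illustrative only.

```python
import csv
import io

# Hypothetical original file: two note lines followed by a ';'-separated table.
ORIGINAL_TEXT = """Source: hypothetical survey export
Note: heart rate in beats per minute
patient_id;drug_a;drug_b
1;67;56
2;80;90
"""


def make_importable(text: str, note_lines: int, delimiter: str = ";"):
    """Drop leading note lines, convert to comma-separated, and log both changes."""
    lines = text.splitlines()[note_lines:]
    rows = list(csv.reader(lines, delimiter=delimiter))
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    changes = [
        f"removed {note_lines} explanatory note lines from the top of the file",
        f"converted delimiter '{delimiter}' to ','",
    ]
    return out.getvalue(), changes


importable, changes = make_importable(ORIGINAL_TEXT, note_lines=2)
print(importable)
for change in changes:
    print("-", change)
```

The list of changes is what goes into the Readme file; the data values themselves stay untouched.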
28. Organizing your data in folders #6
based on the TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files
1.2.2. Command files
One or more files containing code written in the syntax of the (statistical) software you use
for the study
Importing phase: commands to import or read the files and save them in a format that
suits your software
Processing phase: commands that execute all the processing required to transform the
importable version of your files into the final data files that you will use in your analysis
(i.e. cleaning, recoding, joining two or more data files, dropping variables or cases,
generating new variables)
Generating the results: commands that open the analysis data file(s), and then
generate the results reported in your paper.
1.2.3. Analysis files
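The three phases of a command file can be sketched as three functions that are run in order. The TIER protocol describes command files for statistical software in general; Python is used here only as an example, and the data are the tidy heart rate table used elsewhere in this course. All function names are hypothetical.

```python
import csv
import io
import statistics

# Stand-in for an importable data file (normally read from disk).
IMPORTABLE = """patient_id,drug,heart_rate
1,a,67
2,a,80
3,a,64
4,a,85
1,b,56
2,b,90
3,b,50
4,b,75
"""


def import_phase(text: str) -> list:
    """Importing phase: read the importable file into records."""
    return list(csv.DictReader(io.StringIO(text)))


def processing_phase(rows: list) -> list:
    """Processing phase: drop cases with missing values and convert types."""
    cleaned = []
    for row in rows:
        if row["heart_rate"] in ("", "NA"):
            continue
        cleaned.append({
            "patient_id": int(row["patient_id"]),
            "drug": row["drug"],
            "heart_rate": float(row["heart_rate"]),
        })
    return cleaned


def results_phase(rows: list) -> dict:
    """Generating the results: mean heart rate per drug."""
    by_drug = {}
    for row in rows:
        by_drug.setdefault(row["drug"], []).append(row["heart_rate"])
    return {drug: statistics.mean(values) for drug, values in by_drug.items()}


analysis_data = processing_phase(import_phase(IMPORTABLE))
results = results_phase(analysis_data)
print(results)
```

Because every step is scripted, anyone can rerun the file and trace a reported number back to the original data.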
29. Organizing your data in folders #7
based on the TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files
1.2.2. Command files
1.2.3. Analysis files
The fully cleaned and processed data files that you use to generate the
results reported in your paper
The Data Appendix: codebook for your analysis data files: brief description
of the analysis data file(s), a complete definition of each variable (including
coding and/or units of measurement), the name of the original data files
from which the variable was extracted, the number of valid observations for
the variable, and the number of cases with missing values
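The counts that belong in the Data Appendix can be generated from the analysis data file itself. The sketch below is illustrative: the data, the variable definitions and the file name are hypothetical, and missing values are assumed to be coded as NA or left empty.

```python
import csv
import io

# Hypothetical analysis data file with one missing value.
ANALYSIS_DATA = """patient_id,drug,heart_rate
1,a,67
2,a,80
3,a,NA
4,b,56
"""

# Hypothetical definitions, normally written by the researcher.
DEFINITIONS = {
    "patient_id": "identifier of the patient",
    "drug": "drug administered (a or b)",
    "heart_rate": "heart rate in beats per minute",
}


def data_appendix(text: str, definitions: dict, original_file: str) -> list:
    """One codebook entry per variable, with valid and missing counts."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    appendix = []
    for variable in reader.fieldnames:
        values = [row[variable] for row in rows]
        missing = sum(1 for value in values if value in ("", "NA"))
        appendix.append({
            "variable": variable,
            "definition": definitions.get(variable, "TODO: define"),
            "original_file": original_file,
            "valid_observations": len(values) - missing,
            "missing": missing,
        })
    return appendix


appendix = data_appendix(ANALYSIS_DATA, DEFINITIONS, "survey-export.csv")
for entry in appendix:
    print(entry)
```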
30. Organizing your data in folders #8
based on the TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files
1.2.2. Command files
1.2.3. Analysis files
1.3. Documents
An electronic copy of your complete final paper
The Readme-file for your replication documentation
What statistical software or other computer programs are needed to run the
command files
Explain the structure of the hierarchy of folders in which the documentation is
stored
Describe precisely any changes you made to your original data files to create
the corresponding importable data files
Step-by-step instructions for using your documentation to replicate the
statistical results reported in your paper
1.4. Literature
Retrieved relevant literature
31. 1. Storage, back up of data: http://www.data-archive.ac.uk/create-manage/storage
2. Local ICT infrastructure: https://intranet.tue.nl/en/university/services/ict-services/ict-service-
catalog/management-services/data-management-storage/ (TU/e intranet)
3. SURFdrive (at TU/e): https://intranet.tue.nl/en/university/services/ict-services/ict-service-
catalog/management-services/data-management-surfdrive
4. DataverseNL: https://dataverse.nl/dvn/
5. Version control: http://www.data-archive.ac.uk/create-manage/format/versions
6. Best practices for file naming: http://library.stanford.edu/research/data-management-services/data-best-
practices/best-practices-file-naming
8. File organization: Haselager, dr. G.J.T. , Aken, prof. dr. M.A.G. van (2000): Personality and Family
Relationships. DANS. http://dx.doi.org/10.17026/dans-xk5-y7vc (Data guide, p. 24-26)
9. Best practices: file names and folder structures (Leiden example):
http://blogs.library.leiden.edu/researchdata/2016/06/03/best-practices-file-names-and-folder-
structures/#more-284
10. Beatriz Ramirez, Data management plan for the PhD project: development and application of a monitoring
system to assess the impacts of climate and land cover changes on eco-hydrological processes in an
eastern Andes catchment area: http://www.wageningenur.nl/web/file?uuid=3f974938-79a0-421f-b1ad-
95eef49d777c&owner=c057b578-4a6a-4449-881b-17fff17e2f1a (see Figure 1 for folder structure)
11. TIER documentation protocol: http://www.projecttier.org/
URL’s of mentioned webpages
in order of appearance
32. A basic course on Research data management
part 3: sharing your data
PROOF course Information Literacy and
Research Data Management
TU/e, 07-03-2017
l.osinski@tue.nl, TU/e IEC/Library
33. Research data management
Sharing your data, or making your data findable and accessible
with good data practices
+ protecting your data: back up, access control; file naming, organizing
data, versioning
→ sharing your data via collaboration platforms and archives
Caring for your data, or making your data usable and
interoperable with good data practices
+ tidy data
+ metadata/documentation
+ licenses
+ open data formats
Research data management
what was it again
34. [Figure: services arranged by phase (During research, After research) and by provider (Institution, Discipline, Local ICT services)]
Overview research data sharing
and storage services
Data sharing per se is pretty straightforward
35. General data sharing platforms:
SURFdrive [TU/e only]: Dutch academic Dropbox, 100 GB, maximum data transfer 16 GB
every TU/e employee can use SURFdrive
Google Drive, Dropbox, Beehub…
DataverseNL [TU/e only]: data sharing platform for active research data [based on Harvard’s
Dataverse Project] where you may:
store your data in an organized and safe way
clearly describe your data
version control of your data
arrange access to your data
get recognition for your data
[collaborate on your data]
Various disciplinary initiatives: Open Science Framework, OpenML, RodRep, CRCNS…
SURF Filesender [secure data transfer up to 500 GB!, WeTransfer up to 2 GB]
Sharing your data
collaboration or sharing platforms (during your research)
Storage and backup of data through DANS [Dutch
Archiving and Networking Services]
Data transfer: up to 2 GB per dataset
Dataverse via 4TU.ResearchData: up to 50 GB free
36. How to create an account:
Go to: https://dataverse.nl/
Click ‘Log in’ (at the top right); under Institutional account click SURFconext
Select Eindhoven University of Technology and log on with your TU/e username and
password
When asked for it, give permission to share your data by answering Yes or click this
Tab
When asked to create an account, answer Yes or click this Tab.
Once you have created an account, your username is the prefix of your
email address
You now have a user account with DataverseNL.
If you click 4TU dataverse → Eindhoven dataverse → Add data you can create and
publish data sets, upload files and assign access rights to data sets or files.
However, before you proceed, contact me (for more options) or first use the demo
version: https://act.dataverse.nl
Sharing your data
DataverseNL
If you are interested in using DataverseNL, please contact me (Leon Osinski)
37. On request
“I'd like to thank E.J. Masicampo and Daniel LaLande for sharing and allowing me to share
their data…”
Daniël Lakens (2014), What p-hacking really looks like: A comment on Masicampo & LaLande (2012)
On a (personal) website
“Let me start by saying that the reason why I put all excel files online, including all the
detailed excel formulas about data constructions and adjustments, is precisely because I
want to promote an open and transparent debate about these important and sensitive
measurement issues.”
Thomas Piketty, My response to the Financial Times, HuffPost The Blog, 29-05-2014 ;
originally published as Addendum: Response to FT, 28-05-2014
A data journal
Journal of open psychology data, Geoscience data journal,
Data in brief, Scientific data, Data reports
Sharing your data
after your research has ended
Source: www.aukeherrema.nl
38. Choose a repository where other researchers in your discipline are sharing their data, for example
LXcat (for plasma data), TurBase (for turbulence data) or GenBank (for genetic sequence data)
Overview of research data repositories: Re3data.org
Use a repository that at least assigns a persistent identifier to your data (DOI) and requires that
you provide adequate metadata
General or multidisciplinary repositories: Zenodo, Figshare, DANS, Dryad, B2SHARE
4TU.ResearchData
+ small to medium-sized data sets, long tail data
+ static data, ‘frozen’ data sets, ‘milestone’ data sets
+ preferably nonproprietary data formats suitable for long term preservation
+ DOIs [ persistent identifiers for citability and retrievability ]
+ open access
+ long-term availability, Data Seal of Approval
+ Data Citation Index (Thomson Reuters)
+ self-upload (single data sets < 3 GB)
+ special collections of related data sets
Sharing your data
after your research has ended, by publishing and archiving them in an established
repository
39. Link your data to your publication
Sharing your data
link your data to your publication
40. 1. Overview research data storage and sharing services: http://dataservices.silk.co/
2. DataverseNL: https://www.dataverse.nl/dvn/
3. Harvard’s Dataverse Project: http://dataverse.org/
4. Open Science Framework: https://cos.io/osf/
5. OpenML: http://www.openml.org
6. RodRep: http://www.rodrep.com/
7. CRCNS: http://crcns.org/
8. SURFdrive: https://www.surfdrive.nl/
9. Google Drive: https://www.google.com/drive/
10. Dropbox: https://www.dropbox.com/
11. Beehub: https://beehub.nl/system/
12. SURF filesender: https://filesender.surfnet.nl/
12. Data on request (blog post Daniel Lakens): http://daniellakens.blogspot.nl/2014/09/what-p-hacking-really-
looks-like.html
13. Data on personal website (Thomas Piketty): http://piketty.pse.ens.fr/en/capital21c2
14. Overview of (better known) data journals: http://proj.badc.rl.ac.uk/preparde/blog/DataJournalsList
URL’s of mentioned webpages
in order of appearance #1
41. 15. Data journal: Journal of Open Psychology Data: http://openpsychologydata.metajnl.com/
16. Data journal: Geoscience Data Journal: http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)2049-6060
17. Data journal: Data in brief: http://www.journals.elsevier.com/data-in-brief
18. Data journal: Scientific data: http://www.nature.com/sdata/
19. Data journal: Data reports: http://www.frontiersin.org/news/Data_Reports_a_new_type_of_peer-
reviewed_article_in_Frontiers_journals/1051?utm_source=FRN&utm_medium=ECOM&utm_campaign=T
WT_FRN_1502_datareport
20. Research data catalogue: Re3data.org: http://service.re3data.org/search/results?term=
21. Publishing data: Zenodo: http://www.zenodo.org/
22. Publishing data: Figshare: http://www.figshare.com
23. Publishing data: DANS: http://www.dans.knaw.nl/en
23. Publishing data: Dryad: http://datadryad.org/
24. Publishing data: B2SHARE: https://b2share.eudat.eu/
25. Publishing data: 4TU.ResearchData: https://data.4tu.nl/
26. Long tail research data: http://www.nature.com/neuro/journal/v17/n11/fig_tab/nn.3838_F1.html
URL’s of mentioned webpages
in order of appearance #2
42. 27. Preferred data formats 4TU.ResearchData: http://researchdata.4tu.nl/en/publishing-research/data-
description-and-formats/
28. Data Seal of Approval: http://www.datasealofapproval.org
29. Data Citation Index (Thomson Reuters): http://wokinfo.com/products_tools/multidisciplinary/dci/
30. Self upload 4TU.ResearchData: https://data.4tu.nl/account/login/?next=/upload/
31. Data sets underlying PhD thesis Joos Buijs: http://dx.doi.org/10.4121/uuid:26aba40d-8b2d-435b-b5af-
6d4bfbd7a270
32. PhD thesis Joos Buijs: http://dx.doi.org/10.6100/IR780920
URL’s of mentioned webpages
in order of appearance #3
43. A basic course on Research data management
part 4: caring for your data, or
making data usable
PROOF course Information Literacy and
Research Data Management
TU/e, 07-03-2017
l.osinski@tue.nl, TU/e IEC/Library
44. Research data management
Sharing your data, or making your data findable and accessible
with good data practices
+ protecting your data: back up, access control; file naming, organizing
data, versioning
+ sharing your data via collaboration platforms and archives
→ Caring for your data, or making your data usable and
interoperable with good data practices
+ tidy data
+ metadata/documentation
+ licenses
+ open data formats
Research data management
what was it again
Before data can be reusable, it has first to be usable
45. Tidy data is about structure of a table / data set.
Tidy data ≠ clean data. It’s a step towards clean data
+ Each variable you measure is in one column
+ Column headers are variable names
+ Each observation is in a different row
+ Every cell contains only one piece of information
Tidy data
making your data easy to handle for computers
46. Tidy data allow your data to be easily:
+ imported by data management systems
+ analyzed by analysis software
+ visualized, modelled, transformed
+ combined with other data (interoperability)
Tidy data
why
47. Tidy data versus messy data
Tidy data
1. Each variable you measure is in one column
2. Column headers are variable names
3. Each observation is in a different row
4. Every cell contains only one piece of information
Messy data
1. More than one variable in a single column (‘clumped data’)
2. Column headers are values, or: one variable over many columns (‘wide data’)
3. Variables are in rows and columns
4. More pieces of information in one cell (cells are highlighted or colored; values and measurement units in one cell)
48. Tidy data versus messy data
example
‘Wide’ data: one variable over many columns
patient_id drug_a drug_b
1 67 56
2 80 90
3 64 50
4 85 75
Tidy data
patient_id drug heart_rate
1 a 67
2 a 80
3 a 64
4 a 85
1 b 56
2 b 90
3 b 50
4 b 75
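The step from the wide table to the tidy table on this slide can be scripted. The sketch below uses plain Python and the values from the slide; the function name `wide_to_tidy` and its parameters are hypothetical. In R, the tidyr package offers functions for the same reshaping.

```python
# The 'wide' table from the slide: one variable (heart rate) over two columns.
wide = [
    {"patient_id": 1, "drug_a": 67, "drug_b": 56},
    {"patient_id": 2, "drug_a": 80, "drug_b": 90},
    {"patient_id": 3, "drug_a": 64, "drug_b": 50},
    {"patient_id": 4, "drug_a": 85, "drug_b": 75},
]


def wide_to_tidy(rows, id_column, prefix, variable_name, value_name):
    """Turn every column starting with prefix into rows of (id, variable, value)."""
    value_columns = [column for column in rows[0] if column.startswith(prefix)]
    tidy = []
    for column in value_columns:
        for row in rows:
            tidy.append({
                id_column: row[id_column],
                variable_name: column[len(prefix):],  # 'drug_a' becomes 'a'
                value_name: row[column],
            })
    return tidy


tidy = wide_to_tidy(wide, "patient_id", "drug_", "drug", "heart_rate")
for row in tidy:
    print(row["patient_id"], row["drug"], row["heart_rate"])
```

The column headers drug_a and drug_b were values of the variable drug; in the tidy version they sit in a column of their own.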
49. What is the nature of the “unusual episode” to which this table refers?
50. What is the nature of the “unusual episode” to which this table refers?
Different columns contain
measurements of the same variable:
easier to read and interpret but
difficult to add data (columns) to the
records (rows)
51. Class Sex Age Survived Freq
1 1st Male Child No 0
2 2nd Male Child No 0
3 3rd Male Child No 35
4 Crew Male Child No 0
5 1st Female Child No 0
6 2nd Female Child No 0
7 3rd Female Child No 17
8 Crew Female Child No 0
9 1st Male Adult No 118
10 2nd Male Adult No 154
11 3rd Male Adult No 387
12 Crew Male Adult No 670
13 1st Female Adult No 4
14 2nd Female Adult No 13
15 3rd Female Adult No 89
16 Crew Female Adult No 3
17 1st Male Child Yes 5
18 2nd Male Child Yes 11
19 3rd Male Child Yes 13
20 Crew Male Child Yes 0
21 1st Female Child Yes 1
22 2nd Female Child Yes 13
The same data in a tidy structure (variables
in columns and observations in rows)
“The problem is that people like to view data in a totally different way than
a computer likes to process it.” (Kien Leong)
52. Tools for tidying data
OpenRefine
download OpenRefine: http://openrefine.org/download.html
runs on your computer (not in the cloud), inside the Firefox browser (not in
IE), no web connection is needed
captures all steps done to your raw data ; original dataset is not modified;
steps are easily reversed;
R, tidyr package
scripted language (R (free), Matlab, SAS…) to process data (tidying,
cleaning, etc.), run the analysis and produce final outputs
versus
Excel: processing data through a graphical user interface is bad for data
provenance and documentation because it doesn’t leave a record of the steps taken
53. The table or data set itself
+ columns: use clear, descriptive variable names (no hard to
understand abbreviations), avoid special characters (can cause
problems with some software)
+ rows: if possible, use standard names within cells (derived
from a taxonomy, for example: standard species name, CAS
registry for chemical substances, standard date formats, …)
+ try to avoid coding categorical or ordinal data as numbers
+ missing data: use NA
Documentation / metadata
making your data understandable for humans #1
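Some of these rules can be checked automatically before you share a table. The sketch below tests two of them: variable names without special characters, and NA instead of empty cells. The naming rule (letters, digits and underscores only) is an assumption, and the example table is hypothetical.

```python
import re

# Assumed rule: a name starts with a letter and contains only letters, digits, underscores.
VALID_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def check_table(header: list, rows: list) -> list:
    """Return a list of human-readable problems found in the table."""
    problems = []
    for name in header:
        if not VALID_NAME.match(name):
            problems.append(f"column name {name!r} contains special characters")
    for number, row in enumerate(rows, start=1):
        for name, value in zip(header, row):
            if value.strip() == "":
                problems.append(
                    f"row {number}, column {name!r}: empty cell, use NA for missing data"
                )
    return problems


# Hypothetical table with one bad column name and one empty cell.
header = ["species", "wing_length_mm", "beak depth (mm)"]
rows = [
    ["Geospiza fortis", "67.2", "NA"],
    ["Geospiza fortis", "", "9.1"],
]
problems = check_table(header, rows)
for problem in problems:
    print(problem)
```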
54. The table or data set as a whole
A description (documentation) that at least mentions:
+ size of the data set: number of observations and variables
+ information about the variables and their measurement units
(code book)
+ what’s included and excluded in the data set, why data are
missing
+ description of how you collected the data (study design), data
manipulation steps (provenance)
+ when your data consists of multiple files organized in a folder
structure, an explanation of the structure and naming of the
files
Documentation / metadata
making your data understandable for humans #2
“Research outputs that are poorly documented are like canned goods with the label
removed (…)” (Carly Strasser)
55. Documentation / metadata
metadata standards
Sometimes there are metadata standards for the
documentation of your data set but where no standard
exists, a simple readme file can be good enough
57. 1. Morphological Measurements of Galapagos Finches
http://dx.doi.org/10.5061/dryad.152
Use of standard names (taxonomy, species)
Variable names clear enough? WingL must be wing length but what is N.Ubkl?
Units of measurement?
Based on: Looking after datasets / by Antony Unwin, 01-09-2015,
http://blog.revolutionanalytics.com/2015/09/looking-after-datasets.html
58. Documentation / metadata
making your data findable for humans and search engines
Descriptive metadata, mainly for discovery and identification of
your data
+ creator
+ title
+ short description + key words
+ date(s) of data collection
+ publication year
+ related publications
+ DOI (assigned by data archive)
+ etc.
When uploading your
data in a data archive
like 4TU.ResearchData,
you will be asked to
enter these metadata
A DOI is assigned by
the data archive
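It can help to prepare the descriptive metadata as a small machine-readable record before you start an upload. The sketch below is an illustration only: the field names are not the schema of 4TU.ResearchData or of any metadata standard, and all values are hypothetical. The DOI is left out because the archive assigns it.

```python
import json

# Assumed field names, taken from the list on the slide.
REQUIRED_FIELDS = [
    "creator", "title", "description", "keywords",
    "collection_dates", "publication_year", "related_publications",
]


def missing_fields(record: dict) -> list:
    """Names of required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record.
record = {
    "creator": "Researcher, A.",
    "title": "Heart rate under drug a and drug b",
    "description": "Heart rate of four patients measured under two drugs.",
    "keywords": ["heart rate", "example data"],
    "collection_dates": "2017-01-09/2017-02-17",
    "publication_year": 2017,
    "related_publications": [],
}

todo = missing_fields(record)
print(json.dumps(record, indent=2))
print("still to fill in:", todo)
```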
59. User license
making clear that other people are allowed to use your data
Let other people know in advance what they are
allowed to do with your data by attaching a user license
to it
+ Creative Commons license for data sets
+ GNU General Public License (GPL) for software
+ License selector
60. Open data formats
ensuring the ‘longevity’ of your data
+ with open (non-proprietary) data formats it is best
ensured that the data will remain usable and ‘legible’
for computers in the future
+ open formats are easy to use in a variety of software, like .csv for
tabular data
+ check the data formats that are supported by a data
archive like 4TU.ResearchData
61. Usable data
recommended reading
These 3 papers give a good summary of this module
+ Eugene Barsky (2017), Good enough research data
management: a very brief guide
+ Shannon E. Ellis, Jeffrey T. Leek (2017), How to share
data for collaboration
+ Greg Wilson, et al. (2017), Good enough practices in
scientific computing
62. Data Coach [ website ]
TU/e data librarians (rdmsupport@tue.nl)
Leon Osinski, Sjef Öllers
Recommended reading
Van den Eynden, Veerle e.a. (2011), Managing and sharing data: best
practice for researchers, UK Data Archive
Strasser, Carly (2015), Research data management, NISO
Recommended online course
Essentials 4 data support [English & Dutch]
Support
63. 1. Tidy data: https://www.jstatsoft.org/article/view/v059i10
2. The “Unusual Episode Data“ revisited:
https://www.amstat.org/publications/jse/v3n3/datasets.dawson.html
3. OpenRefine: http://openrefine.org
4. tidyr: http://tidyr.tidyverse.org/
5. R: https://www.r-project.org/
6. Metadata standards: http://rd-alliance.github.io/metadata-directory/
7. Raw Titanic data: https://www.amstat.org/publications/jse/datasets/titanic.dat.txt
8. Documentation to Titanic data: https://www.amstat.org/publications/jse/datasets/titanic.txt
9. Morphological Measurements of Galapagos Finches: http://dx.doi.org/10.5061/dryad.152
10. Looking after data sets: http://blog.revolutionanalytics.com/2015/09/looking-after-datasets.html
11. Descriptive metadata 4TU.ResearchData: http://researchdata.4tu.nl/en/publishing-research/uploading-
data/
12. Creative Commons licenses: https://creativecommons.org/
13. GNU General Public License: https://www.gnu.org/licenses/gpl-3.0.en.html
URL’s of mentioned webpages
in order of appearance #1
64. 14. License selector: https://ufal.github.io/public-license-selector/
15. Preferred data formats of 4TU.ResearchData: http://researchdata.4tu.nl/en/publishing-research/data-
description-and-formats/
16. Eugene Barsky (2017), Good enough research data management: a very brief guide
17. Shannon E. Ellis, Jeffrey T. Leek (2017), How to share data for collaboration
18. Greg Wilson, et al. (2017), Good enough practices in scientific computing
19. TU/e Data Coach: http://www.tue.nl/datacoach
20. Van den Eynden, Veerle e.a. (2011), Managing and sharing data: best practice for researchers, UK Data
Archive
21. Carly Strasser, Research data management:
http://www.niso.org/apps/group_public/download.php/15375/PrimerRDM-2015-0727.pdf
22. Online course ‘Essentials for data support’: http://datasupport.researchdata.nl/en/
URL’s of mentioned webpages
in order of appearance #2
Editor's Notes
Introducing myself and IEC/Library
Question: what do you think of this video? What is, according to Henry Rzepa, the essence of research data management? Does that correspond with your idea of RDM?
Besides sharing your data for re-use, RDM is also about reproducibility.
To do a reproducibility check of your results, raw data are needed and an overview of all the steps you have done with your data (from raw data to cleaned data to processed data to analysed data to published data – the figures and tables in your paper). Where did the data come from?
Re-use of data is future-oriented; reproducibility is past-oriented and is about the quality of your data.
This course is especially about 1: making data available to others. Data sharing requires research data management!
RDM is especially about data sharing, not only after your research but also during your research. Your supervisor wants to take a quick look at your data, your colleague needs some of your data, etc.
Quality control or quality assurance of your data: a. protecting against data loss ; b. protecting data authenticity (ensuring that data has not changed after its creation)
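One common way to check data authenticity (point b above) is to record a checksum when a file is created and compare it later. The following is a minimal sketch, assuming SHA-256 as the checksum; the function names are illustrative and not part of the course material.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Compare the file's current digest with the one recorded earlier."""
    return sha256_of(path) == recorded_digest
```

Record the digest of each raw data file right after collection (for example in a README next to the data) and rerun `is_unchanged` before analysis or archiving.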
Open data = full provenance of where the data comes from + clear copyright statements, licences and/or waivers (Egon Willighagen)
RDM has a knowledge-sharing side (sharing) and an activity side (caring)
Sharing = mainly via archiving/preservation
Caring = making data usable and traceable
Scientific integrity/reproducibility: how did you arrive at your results? Tracing from final outputs like a graph or a figure to the original raw data set
The definition or description of RDM leads to the topics of this training
The first three reasons follow from the description or definition of RDM
“Access to raw data is important for follow-up research, replication research and integrity investigations.”
During your research: RDM enables data sharing; merging another person’s data with your data also allows collaboration
But: [DMP Driessen]: “There is nothing worse than dig through the data of someone else”! So, be clear, use standard / quality metadata. Don’t think: only I have to understand it.
After your research: doesn’t necessarily mean open access; open access when possible; usability of data precedes openness of data
Because data are an asset, worth sharing in order to be reused or built on by others: data-driven science: progress of science not only by building on the same data but especially by combining or merging data from different sources. Not all data are useful for re-use ;
Because data provides the evidence for a published paper; data can be asked for by others in view of verifying or replicating your results (scientific integrity). Validating results by replicating them asks for data. UPSIDE: Uniform Principle of Sharing Integral Data and Materials Expeditiously. Reproducibility = being able to go from data to figures/results!
Because data are unique and/or valuable
4 kinds of data:
observational data (from sensors but also from surveys or field counts): one-time phenomena. In many cases these data cannot be replicated and should therefore be retained
experimental data: data from clinical trials, pharmaceutical testing, psychological experiments but also from high-throughput machines like an accelerator. In some cases it is not feasible or ethical to replicate data collecting. Preservation is particularly important for these experimental data
computational data: data generated from (computational) simulations. Data can be regenerated by rerunning the simulation. Nevertheless, preservation over the medium term can be needed: for subsequent analysis (visualization, data mining), and because computer time for very large-scale computations can be expensive and/or not available at short notice
[ reference data sets: mapping the human genome, documenting proteins, longitudinal data on economic and social status ]
Trustworthiness and credibility of science: “We found the reluctance to share data to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results.” The weaker the evidence, the greater the reluctance to share data ; the stronger the evidence, the more willingness to share data.
Research data are shared less readily when the evidence is weak, and more readily when the evidence is strong.
Because journals, funders, universities or codes of conduct demand data to be accessible and reusable.
If research funders set conditions with regard to data management, this often comes down to the requirement of a data management plan.
‘Take measures’ = a best-effort obligation (Dutch: inspanningsverplichting)
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf [Guidelines on data management in Horizon 2020 ]
Open research data pilot: also covers reuse of data; mainly implemented through a DMP [ DMP as an early deliverable within the first six months of the project ]
Scope: 7 areas of Horizon 2020 ; €3 billion [ 20% of the overall Horizon 2020 budget 2014-2015 ]
Future and emerging technologies
Research infrastructures – part e-infrastructures
Leadership in enabling and industrial technologies – Information and communication technologies
Societal challenge: ‘Secure, clean and efficient energy’ – part Smart cities and communities
Societal challenge: ‘Climate action, environment, resource efficiency and raw materials’ – except raw materials
Societal challenge: ‘Europe in a changing world – inclusive, innovative and reflective societies’
Science with and for society
At the proposal submission stage, the information provided is not part of the evaluation.
Costs relating to the implementation of the pilot will be eligible
3054 proposals: opt out core areas = 24% ; opt in in other areas = 27%
Guidelines on open access to scientific publications and research data in Horizon 2020 (version 1.0, 11 December 2013)
Guidelines on data management in Horizon 2020 (version 1.0, 11 December 2013): open research data pilot
Open research data pilot / Data management plan [ DMP ]
What types of data will the project generate/collect?
What standards will be used?
How will this data be exploited and/or shared/made accessible for verification and re-use? If data cannot be made available explain why
How will this data be curated and preserved?
FAIR data is what Horizon 2020 wants!
FAIR data implies sound research data management. Research data management prepares for FAIR
Why is sound research data management important? Because it is a key conduit to FAIR data.
FAIR data leads to knowledge discovery and innovation
Single DMP for your project to cover its overall approach. However, where there are specific issues for individual datasets (e.g. regarding openness), you should clearly spell this out
The DMP is a set of questions about:
The handling of research data during and after the project
What data sets the project will collect, process and/or generate
Whether and how the data sets will be shared/made open access
How data will be curated and preserved
What measures are taken to safeguard and protect sensitive data
Quote from: H2020 programme, Guidelines on FAIR data management in Horizon 2020, version 3.0 (26 July 2016), p. 5
“A data management plan (DMP) is required for all projects participating in the extended ORD pilot…”
Participating projects will be required to develop a Data Management Plan (DMP), in which they will specify what data will be open: detailing what data the project will generate, whether and how it will be exploited or made accessible for verification and re-use, and how it will be curated and preserved.
The DMP needs to be updated over the course of the project whenever significant changes arise, such as (but not limited to):
New data
Changes in consortium policies (new innovation potential, decision to file for a patent)
Changes in consortium composition and external factors (new members joining or old members leaving)
The DMP should be updated as a minimum in time with the periodic evaluation/assessment of the project
File naming, organizing data, versioning: this is about being able to find your own data again. If a data file can no longer be found, it is ‘lost’.
This is also about being able to find your own data yourself!
Be organized: design naming schemes for your files and folders
Data classification and retention: see DMP Indra Sihar
Data classification and retention: if these are not applied, data volumes and their costs will grow unchecked and get out of control
When will what data no longer be useful and can be discarded?
Maintaining the integrity of data: this implies protecting the mere existence of data, maintaining quality of data and ensuring that data are accessed only by those authorized to do so.
RDM consists of these parts.
minimize the risk of data loss or deletion ;
protect your data from unauthorized use ;
use the correct data. Especially when you edit your data often or collect data through various experiments or tests, identifying the correct data may pose a problem ;
RDM enhances the efficiency of your research.
Avoid special characters in file names, because data files may be read by a script!
File names should be descriptive, reflect the content, and be unique, independent of the folder in which the file is stored.
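The advice on special characters can be checked automatically. The sketch below assumes one possible convention (letters, digits, hyphens and underscores, followed by a single extension); the pattern and function names are illustrative assumptions, not a rule prescribed by the course.

```python
import re

# Assumed convention: letters, digits, "_" and "-", then one ".ext".
SAFE_NAME = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9]+$")


def is_script_friendly(name: str) -> bool:
    """True if the file name follows the assumed convention."""
    return SAFE_NAME.fullmatch(name) is not None


def make_script_friendly(name: str) -> str:
    """Replace runs of disallowed characters in the stem with "_"."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")
    return f"{cleaned}.{ext}" if ext else cleaned
```

For instance, `make_script_friendly("final data (new)!.csv")` gives `final_data_new.csv`. The extension itself is left untouched in this sketch.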
See also (still to be incorporated): http://blogs.library.leiden.edu/researchdata/2016/06/03/best-practices-file-names-and-folder-structures/#more-284
“The first step in making a research project reproducible is to make sure that the files associated with it are organized”
[https://tomwallis.info/2014/01/16/setting-up-a-project-directory/]
See also: https://nicercode.github.io/blog/2013-04-05-projects/ and http://dx.doi.org/10.1371/journal.pcbi.1000424
Organizational scheme on the basis of file formats: should all .csv files be grouped together?
Other ‘protocols’: https://tomwallis.info/2014/01/16/setting-up-a-project-directory/
Perhaps add a subfolder in this folder (Analysis files) for the visualizations (figures, tables) themselves.
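A fixed project folder layout is easiest to keep consistent when a script creates it. The sketch below is only an illustration: the folder names are assumptions loosely inspired by the protocols mentioned above, not the official names of the TIER protocol or of any other scheme.

```python
from pathlib import Path

# Hypothetical layout; adapt the names to the protocol you follow.
LAYOUT = [
    "original_data",
    "command_files",
    "analysis_files",
    "analysis_files/figures",  # subfolder for the visualizations
    "documents",
]


def create_project(root: Path) -> list[Path]:
    """Create the folder skeleton under root and return the folders."""
    created = []
    for relative in LAYOUT:
        folder = root / relative
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Because `exist_ok=True` is used, rerunning the function on an existing project leaves existing folders and files untouched.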
Raw data cannot be understood as such and needs processing; processing gives meaning to raw data.
https://twitter.com/TrevorABranch/status/648987799648014336 : “My rule of thumb: every analysis you do on a dataset will have to be redone 10–15 times before publication. Plan accordingly”
Dataverse Network: 2 Gb
Informal peer-to-peer sharing makes it difficult to know which data can be obtained where, requires the right contact, makes managing data access a burden and does not ensure availability of the data in the long-term.
Project websites can offer easy immediate storage and dissemination, but will offer less sustainability and it is difficult to control who uses your data and how they use it unless administrative procedures are in place.
4TU.RD is not about storage!
TurBase: still to be mentioned on the slide
Figshare: free till 1 Gb
DANS: Dutch, social sciences and humanities
Dryad: not free (90 euro for 10 Gb), only data underlying publications
Who knows what DOIs are?
Are these data complete? What is missing?
Other = crew
N.Ubkl: distance from nose to upper beak (maxilla, upper jaw)
Fig S2 as an example (click Save and then Open)
One piece of information:
Not in one cell (column Address: Piuslaan 50, 5614 CM Eindhoven), but in separate columns (cells): House number: 50; Street: Piuslaan; City: Eindhoven; Postal code: 5614CM.
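The address example above can be sketched in code: one combined cell is split into one value per column. This is a minimal sketch that assumes Dutch-style addresses of the form "street number, postal code city"; the pattern and the column names are illustrative assumptions.

```python
import re

# Assumed format: "<street> <number>, <4 digits> <2 letters> <city>"
ADDRESS = re.compile(
    r"^(?P<street>.+?)\s+(?P<house_number>\d+\w*),\s*"
    r"(?P<postal_code>\d{4}\s?[A-Z]{2})\s+(?P<city>.+)$"
)


def split_address(address: str) -> dict[str, str]:
    """Split one address string into one value per column."""
    match = ADDRESS.match(address.strip())
    if match is None:
        raise ValueError(f"Unrecognized address format: {address!r}")
    parts = match.groupdict()
    # Store the postal code without the internal space.
    parts["postal_code"] = parts["postal_code"].replace(" ", "")
    return parts
```

Addresses that do not fit the assumed format raise an error instead of being split silently into wrong columns.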
In Excel: constrain data entries, so that only names from a taxonomy can be entered;
Categorical or ordinal data: value for sex should be male or female, use ‘poor’, ‘fair’, ‘good’ not 1, 2, 3
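Outside Excel, the same constraint can be checked with a small script that compares each entry against a list of allowed values. The sketch below is illustrative: the column name `condition` and the function name are assumptions, and the allowed values are taken from the note above.

```python
# Allowed values per column; "condition" is a hypothetical column name.
ALLOWED = {
    "sex": {"male", "female"},
    "condition": {"poor", "fair", "good"},
}


def find_invalid_entries(rows, allowed=ALLOWED):
    """Return (row index, column, value) for every disallowed entry."""
    problems = []
    for index, row in enumerate(rows):
        for column, permitted in allowed.items():
            value = row.get(column)
            if value not in permitted:
                problems.append((index, column, value))
    return problems
```

Rows can come from `csv.DictReader`, so the check can run on every data file before it is shared or archived.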