This document discusses managing and sharing research data in an ideal versus real world setting. It outlines an agenda covering an introduction, a definition of research data management, ethics and integrity, context and policy drivers, incentives for data management, practical considerations, and case studies, concluding with a Q&A. Key points include the importance of documentation, metadata, backups, and long-term data deposit. Research data management matters for reproducibility and ethics, and is increasingly required by funders and journals.
Managing and Sharing Research Data: Good practices for an ideal world...in the real world.
1. Managing and Sharing
Research Data:
Good Practices for an Ideal World…
in the Real World
Martin Donnelly
Digital Curation Centre
University of Edinburgh
University of Sheffield
19 January 2012
2. Running Order
1. Introduction
2. What is meant by managing research data?
3. Research data management and research ethics/integrity
4. Context and policy
5. The Why
Pt. 1 – It’s A Good Thing
Pt. 2 – Carrots
Pt. 3 – Sticks
6. Practicalities and Moving Forward
7. Sheffield Stories
8. Last Words
9. Q+A
4. Digital Curation Centre
- Founded in 2004 to support research in UK higher and further
education in the preservation, curation and management of
digital resources
- Major funder is JISC
- Original focus on publications / biblio; now more emphasis on
research data management
- Support to JISC projects, especially the two Managing Research
Data programmes...
http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata.aspx
- Tools, training, guidance, consultancy, other resources/studies…
- Three partner sites: Edinburgh (lead), Bath and Glasgow
6. What is meant by managing
research data?
Lots of strands…
- Ensuring physical integrity of files and helping to preserve them
- Ensuring safety of content (data protection, ethics, etc)
- Describing the data (via metadata) and recording its history
- Providing or enabling appropriate access at the right time, or
restricting access, as appropriate
- Transferring custody at some point, and possibly destroying the data
In short, RDM means meeting funder, institutional,
disciplinary and other requirements/norms across various
areas and at different times, in sympathy with the nature
of the data itself, for the benefit of yourself, your
institution, and the wider community, as appropriate.
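The first strand above, ensuring the physical integrity of files, is commonly handled with checksums. The following is a minimal sketch rather than a DCC-recommended tool: the function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical, and SHA-256 is an assumed choice of algorithm. It records a checksum for every file under a data directory and later reports any file that has gone missing or changed.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or have different contents."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built when the data is first stored can be kept alongside the other documentation. Re-running `changed_files` at intervals, or after copying the data to a new location, gives a simple check that nothing has been silently altered.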
8. RDM and research ethics/integrity
- RDM is increasingly seen as a core research competency, along with things
like writing and referencing (see RCUK Common Principles >>)
9. Policy Streamlining
RCUK Common Principles on Data Policy
Key messages:
1. Data are a public good
2. Adherence to community standards and best practice
3. Metadata for discoverability and access
4. Recognise constraints on what data to release
5. Permit embargo periods delaying data release
6. Acknowledgement of / compliance with T&Cs
7. Data management and sharing activities should be explicitly funded
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
10. RDM and research ethics/integrity
- RDM is increasingly seen as a core research competency, along with things
like writing and referencing (see RCUK principles >>)
- Research outputs (which constitute the scientific record) are often based on
the collection, analysis and processing of data / sources / information
- Reproducibility and verifiability are fundamental principles in many
disciplines. In other disciplines, including those where research cannot be
replicated such as social and environmental sciences, the longevity of the
data from which the findings are derived is equally crucial
- Some data is unique and cannot be replaced if destroyed or lost, yet only by
referring to trustworthy data can research be judged as sound
- Therefore data must be accessible and comprehensible in order to back up
claims, and enable third parties to reproduce (or validate) results
- Additionally, there is increasing demand for public (or Open) access to
publicly-funded research outputs, including data, but more on that later…
12. Institutional and funder perspectives
- Research today is technology enabled and data intensive
- Data as long-term asset; identify and preserve
- The fragility and cost of digital data; curate to reuse and
preserve
- Data sharing: research pooling, cross-disciplinary and global
partnering, new research from old, the wealth of knowledge
- The cost of technology and human infrastructures
- Pressure to show return on public investment of £3.5bn
- Compliance with legislation and funder policies
- The data deluge: volume and complexity, not just in HEIs
- Financial and human consequences from lost data
- The cost of administering unmanaged datasets
13. Context
“For science to effectively function, and for society to
reap the full benefits from scientific endeavours, it is
crucial that science data be made open”
Surfing the Tsunami
Science, 11 February 2011
15. Policy
RCUK Policy and Code of Conduct on the Governance of Good Research Conduct, 2008 (updated October 2011)
UNACCEPTABLE RESEARCH CONDUCT includes mismanagement or inadequate preservation of data and/or primary materials, including failure to:
- keep clear and accurate records of the research procedures followed and the results obtained, including interim results;
- hold records securely in paper or electronic form;
- make relevant primary data and research evidence accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for 10 yrs (in some cases 20 yrs or longer);
- manage data according to the research funder's data policy and all relevant legislation;
- wherever possible, deposit data permanently within a national collection.
Responsibility for proper management and preservation of data and primary materials is shared between the researcher and the research organisation.
EPSRC expects all those institutions it funds:
- to develop a roadmap that aligns their policies and processes with EPSRC’s expectations by 1st May 2012;
- to be fully compliant with these expectations by 1st May 2015.
Compliance will be monitored and non-compliance investigated.
Failure to share research data could result in the imposition of sanctions.
17. The Why (pt. 1)
It’s A Good Thing
– Data as a public good (see RCUK Common Principles)
– Others can build upon your work (the Shoulders of
Giants, Newton) and it may be useful in ways you did
not foresee, beyond your discipline (‘fresh eyes and
new techniques or approaches’)
– Passing custody enables you to leave the preservation
legwork to the specialists
– You won’t be around forever, but your work might be
18. The Why (pt. 2)
Incentives, or “Why Should I Spend Time On This
When I Have Other Things To Worry About?”
- Impact. Linking papers to data increases citation rates,
see for example Henneken & Accomazzi, Smithsonian
Astrophysical Observatory:
http://arxiv.org/PS_cache/arxiv/pdf/1111/1111.3618v1.pdf (pre-print)
- Warning! Some numbers follow…
19. Institutional cost saving
Researcher career benefits
Growing popularity of re-use
Sharing as a catalyst
for discovery
http://www.dcc.ac.uk/resources/briefing-papers
21. Impact
- Making data accessible increases citation rates
- Better for authors; better for publishers
- Piwowar, Day & Fridsma (2007):
- 48% of studies make data accessible
- They receive 85% of citations
- N.B. correlation is not causation…
doi:10.1371/journal.pone.0000308
4th DCC Roadshow - Oxford, 2011-09-14. Kevin Ashley, DCC, CC-BY-SA
22. Key findings
- 2.98 more publications per dataset if archived
- 2.77 more if 'informally shared'
Or correct for some confounding factors…
- 2.42 more if archived
- 2.31 more if informally shared
[Chart: Archived, Shared and Not shared data; Raw vs Corrected]
"The enduring value of social science research: The use and reuse of primary
research data"
Amy M. Pienta, George Alter, Jared Lyle
http://hdl.handle.net/2027.42/78307
Presented in Torino, April 2010: "Organisation, Economics and Policy of Scientific
Research"
23. The Why (pt. 2)
More incentives…
- Increased citations help with the
Research Excellence Framework
- Research councils are increasingly
rejecting submissions on the basis of
poor data management plans
- So you get more funding if you do
this right…
24. The Why (pt. 3)
Sticks…
- Some funders require you to make your data available for many
years after project funding has ceased. So laying adequate data
preservation foundations should be near the top of your list
when planning any new research project.
- Funder rejections on the basis of poor data management.
- EPSRC roadmap requirement (N.B. It is likely that DMPs will form
part of many institutional infrastructures) - the institution has
overall responsibility for this, but everyone will need to play a
part, and EPSRC is an important funder at Sheffield. Others may
follow suit…
25. The Why (pt. 3)
Government pressure on RCs…
6.9 The Research Councils expect the researchers they fund to deposit published
articles or conference proceedings in an open access repository at or around the
time of publication. But this practice is unevenly enforced. Therefore, as an
immediate step, we have asked the Research Councils to ensure the researchers
they fund fulfil the current requirements. Additionally, the Research Councils
have now agreed to invest £2 million in the development, by 2013, of a UK
‘Gateway to Research’. In the first instance this will allow ready access to
Research Council funded research information and related data but it will be
designed so that it can also include research funded by others in due course. The
Research Councils will work with their partners and users to ensure information is
presented in a readily reusable form, using common formats and open standards.
http://www.bis.gov.uk/assets/biscore/innovation/docs/i/11-1387-innovation-
and-research-strategy-for-growth.pdf
26. The Why (pt. 3)
- In addition to funders and institutions, prestige journals like Science and Nature already
have data policies in place, and the tendency is towards increasing requirements and
scrutiny here as well as with the funders…
Nature and Science data policies
Nature
Such material must be hosted on an accredited independent site (URL and accession numbers to be provided by the author), or sent to the Nature journal
at submission, either uploaded via the journal's online submission service, or if the files are too large or in an unsuitable format for this purpose, on
CD/DVD (five copies). Such material cannot solely be hosted on an author's personal or institutional web site.[4]
Nature requires the reviewer to determine if all of the supplementary data and methods have been archived. The policy advises reviewers to consider
several questions, including: "Should the authors be asked to provide supplementary methods or data to accompany the paper online? (Such data might
include source code for modelling studies, detailed experimental protocols or mathematical derivations.)"[5]
Science
Database deposition policy – Science supports the efforts of databases that aggregate published data for the use of the scientific community.
Therefore, before publication, large data sets (including microarray data, protein or DNA sequences, and atomic coordinates or electron microscopy
maps for macromolecular structures) must be deposited in an approved database and an accession number provided for inclusion in the published
paper.[6]
Materials and methods – Science now requests that, in general, authors place the bulk of their description of materials and methods online as
supporting material, providing only as much methods description in the print manuscript as is necessary to follow the logic of the text. (Obviously, this
restriction will not apply if the paper is fundamentally a study of a new method or technique.)[7]
REFERENCES
[4] "Availability of Data and Materials: The Policy of Nature Magazine"
[5] "Guide to Publication Policies of the Nature Journals," published March 14, 2007.
[6] "General Policies of Science Magazine"
[7] "Preparing Your Supporting Online Material"
- Finally, a data management plan requirement is very likely to feature in EC FP8 (“Horizon
28. Practicalities
…or, Areas Where The DCC Can Help
- Assessing Need
- Delivering Support
- Developing Strategic Institutional
Research Data Management Support
- Policy
- Advocacy
- Planning
- Tools
- Training
www.dcc.ac.uk
29. Three areas for thought
1. Documentation and metadata
2. Backup
3. Depositing data for the long term
30. Documentation and Metadata
- Could you, or someone else, make sense of
your data five years from now? What about
five minutes from now?
- Metadata is ‘data about data’
- Simple documentation (study level)
– Use consistent file names and informative labels
– Version control
– E.g. ABC_Study4_output_2012-01-19_v1.xls
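The naming convention in the example above can be enforced with a small script. The following is a minimal sketch, not a DCC tool: it assumes the file name breaks down as project, study, content, ISO date and version number, and the field names and helper functions (`build_name`, `parse_name`, `next_version`) are hypothetical.

```python
import re
from datetime import date

# Assumed layout: <project>_<study>_<content>_<YYYY-MM-DD>_v<N>.<ext>
# This is one reading of the slide's example, not an official standard.
NAME_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)_(?P<study>[A-Za-z0-9]+)_(?P<content>[A-Za-z0-9]+)"
    r"_(?P<date>\d{4}-\d{2}-\d{2})_v(?P<version>\d+)\.(?P<ext>[A-Za-z0-9]+)$"
)


def build_name(project, study, content, day, version, ext):
    """Assemble a file name that follows the assumed convention."""
    return f"{project}_{study}_{content}_{day.isoformat()}_v{version}.{ext}"


def parse_name(filename):
    """Split a file name into its labelled parts, or raise ValueError."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"File name does not follow the convention: {filename}")
    parts = match.groupdict()
    parts["date"] = date.fromisoformat(parts["date"])
    parts["version"] = int(parts["version"])
    return parts


def next_version(filename, day):
    """Return the name for the next version of a file, dated `day`."""
    parts = parse_name(filename)
    return build_name(parts["project"], parts["study"], parts["content"],
                      day, parts["version"] + 1, parts["ext"])
```

Using an ISO date (year first) means that an alphabetical file listing is also a chronological one.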
31. Documentation and Metadata
- You may wish to maintain a separate log of high
level metadata about each dataset (text file,
spreadsheet or database)
- Research context (when, where, who)
- Data history (preparation, processing)
- Where and how to access the data
- Access rights and permissions
- Link to supplementary materials, related data,
documents, publications
- Wherever possible, use standardised
vocabularies and metadata formats
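A separate metadata log of this kind could be kept as a plain text file with one record per dataset. The sketch below is illustrative only: the field names are hypothetical choices that mirror the bullets above, not a standardised metadata format, and a real project would normally map them onto a disciplinary or repository schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetRecord:
    """High-level metadata about one dataset (field names are assumptions)."""
    title: str
    created: str                                   # research context: when
    location: str                                  # research context: where
    creators: list                                 # research context: who
    history: list = field(default_factory=list)    # preparation and processing steps
    access_url: str = ""                           # where and how to access the data
    access_rights: str = ""                        # access rights and permissions
    related: list = field(default_factory=list)    # supplementary materials, publications


def to_json_line(record):
    """Serialise one record as a single line of JSON text."""
    return json.dumps(asdict(record), ensure_ascii=False)


def append_record(log_path, record):
    """Append a record to a text log, one JSON object per line."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(to_json_line(record) + "\n")


def read_log(log_path):
    """Read every record back from the log as a list of dictionaries."""
    with open(log_path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]
```

A line-per-record text file stays readable without special software, which matters if the log has to outlive the tools that created it.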
32. Backup
- What would happen to your data if there was a
fire in your office tonight?
- Automatic backup
- Find out if this is available in your Department or
School
- Best practice is at least one automatic off-site
backup
- Manual backup
- Set repeat reminders, e.g. via online calendar
- N.B. Backup and archiving are not the same thing!
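Where no automatic backup is available, a manual backup can at least be scripted and checked. This is a minimal sketch under stated assumptions: the function names are hypothetical, the backup location is whatever directory you pass in (ideally off-site or on a separate device), and it copies a single file rather than providing a full backup system.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy `source` into `backup_dir` and verify the copy by checksum."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} failed checksum verification")
    return target
```

Comparing checksums after the copy guards against silently corrupted backups, which are otherwise only discovered when the data is needed.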
33. Depositing Data for the Long Term
- Check copyright, consent and Data Protection
status
- Identify the appropriate archive / data centre
- Submit form/sample data/supporting
documentation for review
- If accepted, sign Licence Agreement
- Deposit data
- Dissemination?
34. That’s a lot to remember…
It is, but the DCC’s Checklist
for a Data Management Plan
provides a comprehensive list
of issues you might need to
consider…
Not all of it will be relevant to
your work. Start with the
section headings, and use
DMP Online to make your life
easier…
37. Moving Forward
There are lots of guidance resources
available already, e.g.
www.lib.cam.ac.uk/preservation/incremental/
and www.glasgow.ac.uk/datamanagement and
Research Data MANTRA
http://datalib.edina.ac.uk/mantra/
… and Sheffield-focused resources are on the
way.
44. Last Words
- You may be in a small group with not much capacity for
huge changes, but no one expects miracles
- Starting with incremental changes now is better than
burying your head in the sand and hitting a brick wall
later
- You’re not alone! There are lots of resources available,
both institutionally and at a national level
46. Q+A
FAQs pt. 1
Q. I don’t have time for all of this.
A. You should have: the RCUK councils explicitly state that data management
activities should be included as part of funding applications, and institutions
are bound to meet their obligations. It’s not necessary for every researcher to
become an expert in all aspects of RDM, just to know what their role is in the
bigger picture.
Q. How are data management plans actually assessed?
A. It varies from funder to funder. The AHRC has a technical review college,
and ADS has internal guidance on what to look for when marking. All funders
provide markers' guidelines which probably say something about DMPs, but
these tend not to be public documents. A notable exception is ESRC, where
markers’ guidance is produced by the UK Data Archive. We’re hearing more
and more stories of bids rejected on the basis of poor DMPs, so the review
processes may soon become more transparent. Interestingly, the AHRC crops
up in this context more often than the others.
47. Q+A
FAQs pt. 2
Q. Won’t sharing my data mean people can steal my work?
A. No. Others might find things you didn’t (or weren’t looking for), but you
should receive proper attribution. Additionally, most funders permit
embargo periods to enable the original data collectors/creators to benefit
from their work. The risk of plagiarism is the same as when publishing a paper.
Q. How could I possibly share confidential data?
A. If it’s confidential, you probably shouldn’t! Techniques such as
anonymisation and aggregation can be applied in order to safeguard
personal information, and data with commercial significance may also be
protected. It depends on policies and consortium agreements etc, which
should be clearly communicated. ESRC/UKDA, for example, provide advice
on ‘What to tell participants’ re. confidentiality / anonymisation:
http://www.data-archive.ac.uk/create-manage/consent-ethics/consent?index=7
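As a rough illustration of the two techniques named in the answer above, the sketch below replaces direct identifiers with keyed pseudonyms and aggregates exact ages into bands. The function names, the 10-year band width and the record layout are all assumptions made for the example. Note that pseudonymisation alone is weaker than full anonymisation, so this is a starting point rather than a guarantee of confidentiality.

```python
import hashlib
import hmac
from collections import Counter


def pseudonym(identifier, secret_key):
    """Replace an identifier with a keyed hash; keep `secret_key` private."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def age_band(age, width=10):
    """Map an exact age onto a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def aggregate_by_band(records, width=10):
    """Count participants per age band instead of listing exact ages."""
    return Counter(age_band(record["age"], width) for record in records)
```

The same name always maps to the same pseudonym for a given key, so records can still be linked across files without exposing who they belong to.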
48. Thank you
Martin Donnelly
Digital Curation Centre
University of Edinburgh
www.dcc.ac.uk/dmponline
martin.donnelly@ed.ac.uk
Twitter: @mkdDCC
This work is licensed under the Creative Commons
Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License.
To view a copy of this license, (a) visit
http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/; or (b) send a letter to
Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
Image credits:
slide 12 - http://www.psdgraphics.com/3d/gold-pound-symbol/
Slide credits:
Kevin Ashley and Graham Pryor, DCC Edinburgh; Andrew McHugh, DCC Glasgow