Presentation given at the 2nd DPHEP Collaboration Workshop at CERN on 12 March 2017. The talk reflects on machine-actionable DMP use cases developed during an IDCC workshop (http://www.dcc.ac.uk/events/workshops/postcard-future-tools-and-services-perfect-dmp-world) and activities underway or planned by the DCC and UC3 teams. These will be implemented in the DMPRoadmap codebase and piloted in our respective tools, DMPonline and DMPTool.
This document discusses data management plans (DMPs), which are brief plans that define how research data will be created, documented, stored, shared, and preserved. DMPs are often required as part of grant applications. The document provides an overview of why DMPs are important, how they benefit researchers and institutions, and key aspects to address in a DMP such as data organization, stakeholders, and making data FAIR (findable, accessible, interoperable, and reusable). Examples of DMPs from real projects are also presented.
Initially prepared for the CERN/RDA workshop on Active Data Management Plans (28-30 June 2016). Also presented in Denver at International Data Week (12-17 Sept 2016).
The document discusses guidelines and resources for open research data under Horizon 2020, including the Open Research Data pilot. It provides an overview of key guidelines and requirements, such as developing a data management plan, selecting which data to openly license and share, using standards for interoperability and metadata, depositing data in repositories, and finding discipline-specific infrastructure and support. Resources highlighted include guidelines on licensing, the EUDAT licensing tool, Zenodo and other repositories, metadata standards directories, and training from FOSTER and OpenAIRE.
This document provides guidance on developing research data management services at universities. It discusses 10 key steps: 1) Understanding current practices, 2) Deciding what services are needed, 3) Balancing the needs of stakeholders, 4) Securing input and buy-in, 5) Defining roles and responsibilities, 6) Positioning support appropriately, 7) Balancing internal and external provision, 8) Being agile and adaptable to change, 9) Linking systems to integrate services, and 10) Planning for long-term sustainability. The overall message is that developing effective RDM requires understanding user needs, engaging stakeholders, and continually adapting services.
An introduction to Research Data Management and Data Management Planning for research managers and administrators. The presentation was given at the Open University on 18th July 2013.
This document discusses drivers and organizational responses to research data management (RDM) maturity from transatlantic perspectives. It describes external funder mandates in the US and UK that require open sharing of research publications and data. Universities have responded by developing RDM policies, tools, expertise, and education/outreach for researchers. Key RDM components discussed include policies, storage and repository tools, expertise and staffing models, and outreach/education activities. Connecting electronic lab notebooks to other RDM infrastructure is presented as an approach to better integrate researcher workflows with institutional RDM. The document concludes with an invitation to provide comments on RDM maturity through an online survey.
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan (EUDAT)
1st Session: July 7, 2016.
In this webinar, Sarah Jones (DCC) and Marjan Grootveld (DANS) talked through the aspects that Horizon 2020 requires a DMP to cover. They discussed examples from real DMPs and also touched upon the Software Management Plan, which for some projects can be a sensible addition.
This document discusses several studies on user engagement in research data curation. It finds that institutional repositories for data were developed without input from researchers, leading to systems that did not meet researchers' needs. Barriers to open data sharing included concerns over commercial use and maintaining ownership. Successful data curation requires understanding disciplinary differences and developing trusted relationships with researchers through dialogue early in projects.
B2SHARE: Record lifecycle and HTTP API (EUDAT)
B2SHARE is a scientific data repository providing persistent storage and data-sharing facilities. Building on the new Invenio 3.0 digital asset management platform, a new version of B2SHARE has been developed with a focus on improved user experience. In response to requests from the current user base, B2SHARE version 2 provides customizable metadata schemas and a simple but effective workflow for depositing user data, exposed through its RESTful HTTP API.
The presentation will introduce the B2SHARE service, its organizing principles and its basic operations. The metadata schemas and the dataset lifecycle, which are essential to understanding the possibilities of the service, will be the main focus of the talk. The concrete output of the session can be a full paper expanding the presented topics.
Target audience: researchers of any scientific domain who work with publishable data sets.
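The draft-and-publish record lifecycle that the talk covers can be pictured as plain HTTP calls. A minimal sketch in Python, where the host, paths and payload fields are illustrative assumptions rather than the live service's documented API:

```python
# Sketch of a B2SHARE-style record lifecycle over a RESTful HTTP API.
# The base URL and request shapes below are illustrative assumptions,
# not the documented endpoints of the live service.
import json

BASE = "https://b2share.example.org/api"  # hypothetical host

def create_draft(title: str, community_id: str, token: str):
    """Build the request that creates a new draft record."""
    body = json.dumps({"titles": [{"title": title}],
                       "community": community_id,
                       "open_access": True})
    url = f"{BASE}/records/?access_token={token}"
    return ("POST", url, {"Content-Type": "application/json"}, body)

def publish_draft(record_id: str, token: str):
    """Publishing a draft is a JSON-Patch that flips its publication state."""
    patch = json.dumps([{"op": "add",
                         "path": "/publication_state",
                         "value": "submitted"}])
    url = f"{BASE}/records/{record_id}/draft?access_token={token}"
    return ("PATCH", url,
            {"Content-Type": "application/json-patch+json"}, patch)

# A client would hand these tuples to an HTTP library such as `requests`.
method, url, headers, body = create_draft("My dataset", "eudat", "TOKEN")
print(method, url)
```

The point of the sketch is that deposit and publication are ordinary, scriptable HTTP operations, which is what makes repository workflows automatable.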
FAIR data in trustworthy repositories: the basics (OpenAIRE)
This video illustrates how certified digital repositories contribute to making and keeping research data findable, accessible, interoperable and reusable (FAIR). Trustworthy repositories support Open Access to data, as well as Restricted Access when necessary, and they offer support for metadata, sustainable and interoperable file formats, and persistent identifiers for future citation. Presented by Marjan Grootveld (DANS, OpenAIRE).
Main references
• Core Trust Seal for trustworthy digital repositories: https://www.coretrustseal.org/
• EUDAT FAIR checklist: https://doi.org/10.5281/zenodo.1065991
• European Commission’s Guidelines on FAIR data management: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• FAIR data principles: www.force11.org/group/fairgroup/fairprinciples
• Overview of metadata standards and tools: https://rdamsc.dcc.ac.uk/
FAIR Data in Trustworthy Data Repositories Webinar - 12-13 December 2016 (EUDAT)
This webinar was co-organised by DANS, EUDAT and OpenAIRE and was held on 12th and 13th December 2016.
Everybody wants to play FAIR, but how do we put the principles into practice?
There is a growing demand for quality criteria for research datasets. In this webinar we will argue that the DSA (Data Seal of Approval for data repositories) and FAIR principles get as close as possible to giving quality criteria for research data. They do not do this by trying to make value judgements about the content of datasets, but rather by qualifying the fitness for data reuse in an impartial and measurable way. By bringing the ideas of the DSA and FAIR together, we will be able to offer an operationalization that can be implemented in any certified Trustworthy Digital Repository.
In 2014 the FAIR Guiding Principles (Findable, Accessible, Interoperable and Reusable) were formulated. The well-chosen FAIR acronym is highly attractive: it is one of those ideas that almost automatically gets stuck in your mind once you have heard it. In a relatively short time, the FAIR data principles have been adopted by many stakeholder groups, including research funders.
The FAIR principles are remarkably similar to the underlying principles of DSA (2005): the data can be found on the Internet, are accessible (clear rights and licenses), in a usable format, reliable and are identified in a unique and persistent way so that they can be referred to. Essentially, the DSA presents quality criteria for digital repositories, whereas the FAIR principles target individual datasets.
In this webinar the two sets of principles will be discussed and compared and a tangible operationalization will be presented.
Research engagement in EUDAT (EUDAT)
EUDAT’s vision is to enable European researchers and practitioners from any research discipline to preserve, find, access, and process data in a trusted environment, as part of a Collaborative Data Infrastructure (CDI) conceived as a network of collaborating, cooperating centres that combine community-specific data repositories with the permanence and persistence of some of Europe’s largest scientific data centres. EUDAT services are community-driven solutions. This presentation describes the different ways EUDAT engages with research communities.
This presentation provides an introduction to the Open Research Data Pilot in Horizon 2020. It explains why research data management and open data are important, what the requirements of the open research data pilot are, and how OpenAIRE can help you to manage your data, open it up and comply with your funder's open research data policy.
- EC guidelines on open research data for H2020 project including the H2020 DMP template http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
- Online DMP tool with a template for H2020 projects https://dmponline.dcc.ac.uk/
- How to comply with the H2020 Open Research data requirements https://www.openaire.eu/how-to-comply-to-h2020-mandates-for-publications-2
- What is a data management plan and how to write one? https://www.openaire.eu/what-isa-data-management-plan-and-how-do-i-create-one
- For further questions and help, contact us at: https://www.openaire.eu/support/helpdesk
- For further information, check: https://www.openaire.eu/
1) The document outlines the PECAN Phase 2 project which developed a prototype entitlement registry to match up title information with institutional subscriptions and post-cancellation entitlement.
2) Key components of the prototype included designing an entitlement registry demonstrator to ingest and display data, assessing methods for automating data ingestion and maintaining record accuracy over time.
3) Challenges identified included the dynamic nature of deals and titles, defining packages, and developing standard data formats and workflows for publisher data supply to minimize manual intervention.
‘Good, better, best’? Examining the range and rationales of institutional dat... (Robin Rice)
Introduction to panel presentations from Universities of Edinburgh, Southampton, Yale, Cornell at IPRES 2015 conference, Chapel Hill, North Carolina, 3 Nov 2015
IASSIST40: Data management & curation workshop (Robin Rice)
The document summarizes Edinburgh DataShare, an open access data repository at the University of Edinburgh that supports the university's research data management policy. It stores a wide range of research data across disciplines. The repository uses the DSpace platform and is promoting open data, though getting some academics to deposit data can be challenging. It focuses on making metadata and data discoverable through various search tools and indexes. Basic quality assurance checks are performed during the self-deposit process.
Now we are six: Integrating Edinburgh DataShare into local and internet in... (Robin Rice)
#iassist40 presentation, Toronto, 6/6/2014.
Abstract:
Edinburgh DataShare, an institutional data repository, is six years old. It was built as a demonstrator in DSpace by EDINA and the Data Library and has been given new life by the University of Edinburgh’s Research Data Management initiative. Following testing by pilot users in various departments last year, DataShare is confirmed as a key RDM service. Since 2008 much external infrastructure has grown up around data sharing, and software developers, publishers and librarians are creating new innovations around the sharing and re-use of data daily. How can DataShare be shaped to fit into this ever-more-sophisticated environment? A number of ongoing developments are helping us integrate the repository in the global context. DataShare is being indexed in Thomson Reuters’ Data Citation Index. We aspire to attain the Data Seal of Approval for DataShare, a badge that confers trustworthiness through peer review. It is listed in the re3data.org and Databib registries of data repositories. By extension, we offer our depositors peer review of datasets by listing journals that publish ‘data papers’, such as F1000 Research. Locally, as Information Services builds new data services such as the Data Store, the [private data] Vault and the [metadata-only] Register, we can focus DataShare on its named purpose.
This document discusses open data and the process of archiving, documenting, quality checking, integrating, publishing, and redistributing data. It describes how data is archived at the Marine Data Archive and documented with metadata. Quality control ensures data is correctly interpreted and usable. Data is integrated and published through the Integrated Marine Information System with discovery metadata. A data policy advocates open data exchange and making data publicly available while recognizing the original source. The main challenges are convincing scientists to openly share data and having no mandates for participation.
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke... (Tobias Kuhn)
Tobias Kuhn and Michel Dumontier present Trusty URIs, which aim to provide verifiable, immutable, and permanent references to digital artifacts on the Semantic Web. Trusty URIs include a cryptographic hash of the artifact as part of the URI itself. This allows anyone to verify that a URI resolves to the unaltered original artifact. It also ensures URIs remain permanent references even if the original location becomes unavailable. Implementations exist for plain files and RDF graphs, with more modules planned. Evaluations show Trusty URIs successfully validate original artifacts while detecting corrupted copies, and can handle files from kilobytes to terabytes in size. The project is open source with the goal of building a community around developing and
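The hash-in-the-identifier idea can be demonstrated in a few lines. This is a simplified illustration of the principle only; the actual trusty-URI specification defines its own module codes and encoding, which differ from this sketch:

```python
# Simplified illustration of the trusty-URI principle: embed a hash of the
# artifact in the URI itself, so anyone holding the URI can verify that a
# retrieved copy is the unaltered original. The real specification's
# encoding and module codes differ from this sketch.
import base64
import hashlib

def make_trusty_uri(base_uri: str, content: bytes) -> str:
    """Append a URL-safe Base64 SHA-256 digest of `content` to `base_uri`."""
    digest = hashlib.sha256(content).digest()
    tag = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return f"{base_uri}.{tag}"

def verify(trusty_uri: str, content: bytes) -> bool:
    """Recompute the digest and check it matches the one in the URI."""
    base, _, _tag = trusty_uri.rpartition(".")  # tag never contains '.'
    return make_trusty_uri(base, content) == trusty_uri

uri = make_trusty_uri("https://example.org/artifact", b"hello world")
assert verify(uri, b"hello world")       # original content validates
assert not verify(uri, b"tampered")      # any alteration is detected
```

Because the digest travels inside the identifier, verification needs no trusted third party: a mirror copy of the artifact validates just as well as the original location.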
The document summarizes the work of the NASA Astrophysics Data System (ADS) to increase access to astronomical literature and data. ADS is the primary index of astronomical publications, containing over 10 million records. The presentation discusses ADS efforts to link publications to related datasets, telescope bibliographies, and other resources. It also introduces Zenodo as a new platform for archiving and sharing data, publications, and other materials in astronomy.
This document provides an overview of open science and how to practice open science. It defines open science as research carried out and communicated in a way that allows others to contribute and collaborate. The benefits of open science include increased visibility, citations, and economic benefits when data is freely available. It recommends publishing papers through open access routes, sharing data and code openly with permissive licenses, and depositing outputs in repositories to practice open science. The document provides guidance on choosing file formats, metadata standards, and repositories to openly share research outputs.
1) The document summarizes the Horizon 2020 Open Data Pilot, which requires projects in certain areas to make research data openly available.
2) It outlines the benefits of open data such as faster scientific breakthroughs and economic benefits.
3) Key requirements of the pilot include depositing data in a research repository, making it accessible and reusable by third parties, and developing a Data Management Plan. The document provides guidance and tools to help researchers comply.
Michigan State University campus policy, resources and best practices for research data management offered by the MSU Libraries Research Data Management Guidance service. http://www.lib.msu.edu/rdmg/
The slides that will accompany my live webcast for OpenCon 2014 attendees, all about open data in research. The benefits, the how to (both legally & technically), examples, pitfalls, and the future of open research data.
A basic course on Research data management: part 1 - part 4Leon Osinski
Slides belonging to a basic course on research data management. The course consists of 4 parts:
Part 1: what and why
1.1 data management plans
Part 2: protecting and organizing your data
2.1 data safety and data security
2.2 file naming, organizing data (TIER documentation protocol)
Part 3: sharing your data
3.1 via collaboration platforms (during research)
3.2 via data archives (after your research)
Part 4: caring for your data, or making data usable
4.1 tidy data
4.2 documentation/metadata
4.3 licenses
4.4 open data formats
This document summarizes work on developing machine-actionable data management plans (DMPs). It discusses a workshop at CERN where participants from different fields and countries explored how to make DMPs more active and enable data to be exploited over the long term. Key points included identifying use cases, understanding researcher needs, and prioritizing interoperability, persistent identifiers, capacity planning, and increasing data discovery and reuse. Next steps include developing pilot projects to test machine-actionable DMPs in practice.
Active actionable DMPs
1. Active, actionable DMPs
Sarah Jones | Digital Curation Centre | sarah.jones@glasgow.ac.uk
Stephanie Simms | California Digital Library | stephanie.simms@ucop.edu
Daniel Mietchen | National Institutes of Health | daniel.mietchen@nih.gov
Tomasz Miksa | TU Wien | tmiksa@sba-research.org
#ActiveDMPs
3. Planning & administration
Create, analyse,
manage data
Publishing & reuse
• DMP on periphery
• Often done at grant stage and
not looked at again
• Opportunities to (re)use
information being missed
• Disconnected & unlinked
8. 47 participants from 16 countries
• Funders
• Developers
• Librarians
• Service providers
• Researchers
Understand research workflows
Develop use cases for maDMPs
Set priorities for future work
www.dcc.ac.uk/events/workshops/postcard-future-tools-and-services-perfect-dmp-world
Utopia workshop
10. Use cases and prioritisation
• Interoperability with research systems
• Institutional perspective
• Repository use cases
• Evaluation & monitoring
• Utilising PIDs
11. maDMP priority areas
• Common standards and protocols
• Funder integration
• Share/publish/deposit DMPs
• Utilise PIDs for automatic reporting
• Capacity planning (institutional & data centre)
• Automated compliance checks
12. The problem of free text
Templates typically ask very broad questions, even when
dropdown options are feasible (e.g. metadata standards, file
formats, data volumes, repositories, licences…)
13. Plugins to give structured response
http://rd-alliance.github.io/metadata-directory
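As a minimal sketch of what such a plugin could do, the snippet below replaces a free-text question with a typeahead over a controlled vocabulary and validates the answer against it. The standards list is a small illustrative sample; in a real plugin it might be sourced from the RDA metadata standards directory linked above.

```python
# Sketch: a free-text DMP question becomes a closed-choice field backed by
# a controlled vocabulary. The list here is an illustrative sample only.
METADATA_STANDARDS = [
    "Data Documentation Initiative (DDI)",
    "Dublin Core",
    "Ecological Metadata Language (EML)",
    "ISO 19115",
]

def suggest(prefix, vocabulary=METADATA_STANDARDS):
    """Typeahead-style lookup: return vocabulary entries matching the input."""
    needle = prefix.lower()
    return [term for term in vocabulary if needle in term.lower()]

def validate_answer(answer, vocabulary=METADATA_STANDARDS):
    """A structured answer must come from the vocabulary, not free text."""
    return answer in vocabulary

print(suggest("ddi"))                  # ['Data Documentation Initiative (DDI)']
print(validate_answer("Dublin Core"))  # True
```

The same pattern applies to the other dropdown-friendly themes on the previous slide: file formats, repositories, and licences each have well-known registries that could supply the vocabulary.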
14. Define a minimum data model
• DMPRoadmap themes
• Mappings
• Common format?
From Flickr by Steve Johnson, CC BY 2.0
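To make the idea of a minimum data model concrete, here is a hypothetical sketch: all field names and the theme mapping are illustrative inventions (no common format had been agreed at the time of this talk), but they show how free-text template themes could be lifted onto structured, machine-readable keys.

```python
# Sketch of a minimal, exchangeable DMP data model. All field names are
# illustrative; the point is structure, not a finished standard.
import json

minimal_dmp = {
    "title": "Example project DMP",
    "contact": {"name": "A. Researcher", "orcid": "0000-0000-0000-0000"},
    "funding": {"funder": "European Commission", "grant_id": "123456"},
    "datasets": [
        {
            "description": "Survey responses",
            "format": "CSV",
            "volume_gb": 2,
            "license": "CC-BY-4.0",
            "repository": "https://zenodo.org",
        }
    ],
}

# A mapping from DMPRoadmap-style themes to model paths would let answers
# already collected against templates be lifted into the structured form.
THEME_MAP = {
    "Data Format": "datasets[].format",
    "Expected Volume": "datasets[].volume_gb",
    "Data Repository": "datasets[].repository",
}

print(json.dumps(minimal_dmp, indent=2))
```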
16. Repository use cases
I. Repository recommender
service via re3data.org
II. Text mine to ping repositories
when mentioned in a DMP
III. Use DMP as metadata to
facilitate deposit process
IV. Deposit DMPs with data
Ainsley Seago. CC BY 4.0
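Use case I could be sketched as follows. This assumes re3data's v1 REST API (`https://www.re3data.org/api/v1/repositories`), which returns an XML list of repositories; the response below is a canned, trimmed sample so the parsing can be shown without a network call.

```python
# Sketch of a repository recommender backed by re3data.
# SAMPLE_RESPONSE imitates the shape of the re3data v1 API listing.
import xml.etree.ElementTree as ET

SAMPLE_RESPONSE = """\
<list>
  <repository>
    <id>r3d100010468</id>
    <name>Zenodo</name>
    <link href="/api/v1/repository/r3d100010468" rel="self"/>
  </repository>
</list>"""

def repository_names(xml_text):
    """Extract repository names from a re3data-style XML listing."""
    root = ET.fromstring(xml_text)
    return [repo.findtext("name") for repo in root.findall("repository")]

print(repository_names(SAMPLE_RESPONSE))  # ['Zenodo']
```

A recommender in a DMP tool could filter such a listing by the subject and data type the researcher has already entered, then offer the matches as the structured answer to the "where will you deposit?" question.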
17. Institutional use cases
I. Connect researchers to
relevant services & support
II. Gather information to
forecast demand and do
capacity planning
III. Embed DMP in research
process (domain workflows,
ethics, admin systems)
William Murphy. CC BY-SA 4.0
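Use case II, capacity planning, reduces to simple aggregation once DMPs carry structured volume estimates. The records below are illustrative:

```python
# Sketch: forecast institutional storage demand from structured DMP fields.
dmps = [
    {"project": "A", "department": "Physics", "expected_volume_gb": 500},
    {"project": "B", "department": "Physics", "expected_volume_gb": 1200},
    {"project": "C", "department": "Biology", "expected_volume_gb": 300},
]

def storage_forecast(dmps):
    """Sum expected data volumes per department."""
    totals = {}
    for dmp in dmps:
        dept = dmp["department"]
        totals[dept] = totals.get(dept, 0) + dmp["expected_volume_gb"]
    return totals

print(storage_forecast(dmps))  # {'Physics': 1700, 'Biology': 300}
```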
18. Persistent identifiers (PIDs)
• Assign DOIs to releases of DMP versions
• Leverage other PIDs to populate DMP over time:
–Researcher IDs (ORCIDs)
–Funder IDs (FundRef)
–Grant IDs
–Research Resource IDs (RRIDs)
• antibodies, organisms, cell lines
• Also enables compliance monitoring
http://pidapalooza.org
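As one hedged example of populating a DMP from a PID, a researcher's name could be pre-filled from their ORCID record. This assumes the ORCID public API person endpoint (`https://pub.orcid.org/v3.0/{orcid}/person` with a JSON Accept header); the record below is a canned, trimmed sample of that response shape.

```python
# Sketch: pre-populate DMP contact details from an ORCID record.
# SAMPLE_PERSON imitates a trimmed ORCID public API "person" response.
import json

SAMPLE_PERSON = json.loads("""{
  "name": {
    "given-names": {"value": "Sarah"},
    "family-name": {"value": "Jones"}
  }
}""")

def contact_from_orcid(person):
    """Build a display name from an ORCID person record."""
    name = person["name"]
    return f'{name["given-names"]["value"]} {name["family-name"]["value"]}'

print(contact_from_orcid(SAMPLE_PERSON))  # Sarah Jones
```

The same pattern extends to FundRef, grant and RRID lookups: resolve the identifier once, carry the structured metadata forward in the DMP thereafter.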
20. Utilising EC grant IDs in plans
• Harvest grant IDs from OpenAIRE API
• Provide look up when entering project details
• Enables join up of DMP with other outputs
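The steps above could be sketched like this. The endpoint and parameter follow OpenAIRE's project search API (`api.openaire.eu/search/projects?grantID=...`); the XML is a trimmed, illustrative sample of a response, not a real record.

```python
# Sketch: look up EC grant details via OpenAIRE to pre-fill project
# metadata in a DMP. SAMPLE_RESPONSE is illustrative only.
import urllib.parse
import xml.etree.ElementTree as ET

def project_query_url(grant_id):
    """Build an OpenAIRE project-search URL for a given grant ID."""
    return ("https://api.openaire.eu/search/projects?"
            + urllib.parse.urlencode({"grantID": grant_id}))

SAMPLE_RESPONSE = """\
<response>
  <project>
    <code>123456</code>
    <acronym>EXAMPLE</acronym>
    <title>An example H2020 project</title>
  </project>
</response>"""

def project_details(xml_text):
    """Extract the fields needed to pre-fill a DMP's project section."""
    proj = ET.fromstring(xml_text).find("project")
    return {field: proj.findtext(field) for field in ("code", "acronym", "title")}

print(project_query_url("123456"))
print(project_details(SAMPLE_RESPONSE))
```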
21. Evaluation & monitoring
• Automated compliance checks
– did researchers do what they said they would?
• Quality or validation checks
– closed questions / range of defined options
– training and evaluation rubrics
– evaluate FAIRness of data and repository…
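An automated compliance check of the kind described above could be as simple as a set comparison between the outputs promised in a DMP and the DOIs actually found in a repository. All identifiers here are illustrative:

```python
# Sketch: did researchers deposit what they said they would?
promised = {"10.5281/zenodo.1111", "10.5281/zenodo.2222"}  # from the DMP
deposited = {"10.5281/zenodo.1111"}                        # from the repository

def compliance_report(promised, deposited):
    """Compare planned outputs against deposited ones."""
    return {
        "delivered": sorted(promised & deposited),
        "missing": sorted(promised - deposited),
        "unplanned": sorted(deposited - promised),
    }

print(compliance_report(promised, deposited))
```

Anything in "missing" would prompt a follow-up with the researcher; "unplanned" items might simply need the DMP to be versioned and updated.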
23. Summary
Think of DMPs as key elements of
a networked data management
ecosystem:
– connected via a shared
vocabulary
– actionable by humans and
software
– versioned
– public
From Flickr by highwaysengland, CC BY 2.0
24. Join us for more!
Thurs 6th April, 9:30-11:00, @RDA Plenary in BCN, Active DMP IG session
Editor's Notes
Thanks to Jamie for the invite to speak here today.
I’m going to reflect on some work that we’ve been doing at DCC in collaboration with our partners at UC3, as well as Daniel Mietchen from NIH and Tomasz Miksa from TU Wien.
DCC and UC3 have joined up over the last year to converge on a single codebase (DMPRoadmap) to operate both of our tools, DMPonline and DMPTool. These are the two main DMP platforms worldwide and have thousands of users, so they provide a good basis for implementing and road-testing active DMPs.
We’ve all been thinking about how we can make the most of DMPs, so that information is shared between systems and the experience of producing DMPs becomes much more valuable and returns benefits to all involved.
So I want to begin by thinking about the current system and how it is flawed. Most researchers see the DMP as yet another hurdle to overcome, an obstacle in the race to get grant funding.
In the UK we have a very compliance-heavy landscape, and it’s not always clear to what extent DMPs are being monitored, so researchers are encouraged to just pay lip service to the requirements.
When we think about how DMPs fit into the overall research lifecycle and the various systems and processes being followed, all too often they’re on the periphery, something that is done at application stage and then not looked at again. Although funders and others talk about DMPs being actively updated, there’s little to encourage researchers to actually engage with this
Fortunately there is a lot of interest in changing and improving things. Funders are aware that they gather a lot of valuable information in DMPs and are missing opportunities to use this. It could help with data discovery for example to promote reuse of the published outputs. Some funders (Wellcome / NSF) are considering pilots in this area
Another key issue at the moment is that DMPs are disconnected from other systems. A lot more information could be exchanged and reused
Where we hope to get to eventually is to connect up DMPs with the wide range of systems researchers are using in the course of their work, so they act as more of a hub. This includes:
Research Information Management systems like Pure or Symplectic elements so administrative data from research offices can be pushed into DMPs and vice versa
Funder application systems like Je-S and the EC’s Participant Portal. While it’s very difficult to integrate with these, APIs could hopefully exchange some information and make the process more streamlined for researchers
Lab notebook systems like Jupyter
Storage and data management platforms like Dataverse and OSF
Journals like RIO and BMC research notes for publishing DMPs
Repositories (e.g. zenodo & figshare) to deposit DMPs alongside datasets and other outputs
Identifiers also play a key role to track assertions about people/organisations/grants etc, to trigger notifications and automate reporting activities. There are many different identifier systems that could be leveraged in DMPs e.g. fundref, RRID, DOIs, ORCIDs…
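To make identifiers actionable in a DMP, a tool first needs to recognise what kind of PID a given string is and sanity-check it. A minimal Python sketch of this idea follows; the regex patterns are deliberately simplified (real-world PID syntax has more variants), while the ORCID check digit follows the published ISO 7064 MOD 11-2 algorithm:

```python
import re

# Illustrative patterns only -- real-world PID syntax has more edge cases.
PID_PATTERNS = {
    "doi": re.compile(r"^10\.\d{4,9}/\S+$"),
    "orcid": re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$"),
    "rrid": re.compile(r"^RRID:\S+$"),
}

def pid_type(value):
    """Return the first PID type whose pattern matches, or None."""
    for name, pattern in PID_PATTERNS.items():
        if pattern.match(value):
            return name
    return None

def orcid_checksum_ok(orcid):
    """Verify the final check digit of an ORCID iD (ISO 7064 MOD 11-2)."""
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[-1] == expected
```

With recognised, validated PIDs in hand, a DMP platform can then resolve them against the relevant registries to trigger notifications or automate reporting.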
We also need to think about how we should exchange the information across systems e.g. with APIs, defining open format/protocol/standard
Building on these thoughts, I wanted to share a couple of slides that Tomasz Miksa presented at IDCC a few weeks ago. He put forward a vision for DMPs, where all the information you need is in one place and it interacts with various tools, standards and systems.
Tomasz has started using the DCC Checklist for a DMP to map across the sections of a DMP and existing tools and standards
If you’re doing work on machine-actionable DMPs or think you have something to bring, we encourage you to join up and collaborate with us. We’re engaging in a number of international fora e.g. RDA and FORCE11 to ensure there’s consensus on the way forward. We already see a lot of benefits from pooling our knowledge and bringing together people from different backgrounds and countries, so do join in and help us solve this together.
With those thoughts of collaboration in mind, we recently held a workshop at IDCC to gather thoughts on where we should be heading.
We dubbed this our utopia workshop. We didn’t want people to be constrained by what’s feasible now or in the short-term. We really wanted people to consider what would be optimum for all stakeholders involved.
We were hugely oversubscribed so couldn’t accommodate everyone at the workshop. We ended up with 47 people in total from 16 different countries so it was a very international mix. We also had representation from various stakeholder groups…
We asked people to map out their activities and research workflows to see where things connect, to develop specific use cases and set priorities for future work
So everyone got very hands-on. You can see here one of the examples of visualising workflows and connections
And some of the use cases and prioritisation:
Interoperability across research systems was the most popular topic, with a few groups discussing connections between different platforms
A number addressed institutional use cases, considering how universities could support researchers with DMPs
The repository use cases again considered integrations. You can see in the photo the desired two way exchange of info (DMPs alerting repos to data, and repos pushing info on datasets and DOIs back into DMPs)
Evaluation and monitoring was a topic of interest, particularly among funders
And the PID group also generated a lot of tangible starting points
We’re writing up full notes from the workshop, which we expect to release to participants in the next week and then publish in RIO Journal as a white paper.
To give you a sneak preview, a number of priority areas came out across the groups:
All stakeholders expressed a need for common standards as a foundation to enable information to flow between plans and systems. It’s a top priority to define a minimum data model with a core set of elements for DMPs. It could potentially be based on an existing template structure or DMPRoadmap themes
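To make the idea of a minimum data model concrete, here is a small Python sketch. The field names are hypothetical (a real core set would be agreed by the community, e.g. via the RDA, or derived from the DMPRoadmap themes); the point is that even a handful of structured elements makes a plan exchangeable between systems:

```python
from dataclasses import dataclass, field, asdict

# Hypothetical field names -- a real minimum data model would be agreed
# by the community; this only illustrates the shape of the idea.
@dataclass
class MinimalDMP:
    title: str
    contact_orcid: str                  # who is responsible for the data
    funder_id: str = ""                 # e.g. a FundRef funder identifier
    grant_id: str = ""
    repositories: list = field(default_factory=list)  # planned deposit targets
    estimated_volume_gb: float = 0.0    # feeds capacity planning
    licence: str = ""

plan = MinimalDMP(
    title="Example project DMP",
    contact_orcid="0000-0002-1825-0097",
    grant_id="654321",
    repositories=["Zenodo"],
    estimated_volume_gb=250.0,
)
record = asdict(plan)  # machine-readable form, ready to exchange via an API
```

A dictionary like `record` could then be serialised to JSON and pushed to, or pulled from, funder and institutional systems.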
Given that funders often drive demand for DMPs, there’s a lot of demand to integrate with their systems on some level. Basic interoperation to support grant submission, monitoring and reporting would help.
An increasing number of researchers are publishing DMPs and we want to support open practices. Connections with journals or repositories would help here.
PIDs could be used in several ways to pull in relevant data and also help automate reporting
Harvesting information on data volumes was critical in an institutional context and for domain data centres so they can do capacity planning exercises and ensure support is costed in
Some automation of compliance checks is also desired e.g. checking that data has been deposited in the named repository
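A basic version of such a compliance check can be sketched as a comparison between the repositories named in the plan and the deposits actually found. The input shape below is hypothetical (in practice the deposit list might be harvested from repository or DataCite APIs):

```python
def compliance_report(planned_repos, deposited):
    """Compare repositories named in a DMP against actual deposits.

    `deposited` maps repository name -> list of dataset DOIs, as might be
    harvested from repository APIs (hypothetical input shape).
    """
    planned = {r.lower() for r in planned_repos}
    actual = {r.lower() for r in deposited if deposited[r]}
    return {
        "fulfilled": sorted(planned & actual),
        "missing": sorted(planned - actual),    # promised, but nothing found
        "unplanned": sorted(actual - planned),  # deposits the DMP never mentioned
    }

report = compliance_report(
    ["Zenodo", "UK Data Service"],
    {"zenodo": ["10.5281/zenodo.123456"], "figshare": ["10.6084/m9.figshare.1"]},
)
```

Even this crude set comparison would let a funder flag plans where a promised deposit never materialised, or where outputs appeared somewhere the plan never mentioned.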
So I now want to tease into some of the issues in more detail and give you a sense of some work that is underway, others things that are planned or activities we hope to do.
One of the primary issues we face in terms of machine-actionability is the format that DMPs are in. They are typically free-form text documents as funders and others tend to ask very broad questions, even when a limited range of options could be provided.
One thing we plan to do is draw in external resources such as the RDA Metadata Standards Directory and Biosharing. This will have a number of benefits: it will help direct researchers to relevant options as they’re not always aware of good practice, and it will mean that some more structured responses are provided for future action.
There are a number of databases e.g. re3data, or wizards like the EUDAT licensing tool that could be used in this way
One of the key priorities coming out of the workshop is to define a minimum data model for DMPs
The DCC did some work a few years back to analyse requirements and good practice and derive a basic checklist for a DMP along with common themes. We’ve since revised these themes in collaboration with UC3 to check that they suit both the US and UK landscapes.
Currently the themes are used to associate questions and guidance within the tool, but we want to explore the potential for basic tagging (e.g. for text mining sections of DMPs)
We are also working with repositories to test mapping themes to other vocabularies e.g. DDI working group for social science repos
These themes may also form the basis of a common format
We are introducing a new text editor into the DMPRoadmap platform (the codebase from which we will operate DMPonline and DMPTool)
Substance Forms is a Javascript library which will provide a simpler, cleaner way to write plans and annotate text with comments
In future, we hope the editor can also support us to mine the text for expected terms e.g. repository names
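A first cut of such term mining can be as simple as matching free text against a gazetteer of repository names. The list below is a tiny illustration; in practice it could be seeded from a registry such as re3data:

```python
import re

# Tiny illustrative gazetteer -- in practice this could be seeded from
# a registry such as re3data.
KNOWN_REPOSITORIES = ["Zenodo", "Dryad", "figshare", "Dataverse", "GenBank"]

def find_repositories(dmp_text):
    """Return known repository names mentioned in free-text DMP sections."""
    found = []
    for name in KNOWN_REPOSITORIES:
        if re.search(r"\b" + re.escape(name) + r"\b", dmp_text, re.IGNORECASE):
            found.append(name)
    return found

text = "Sequence data will be submitted to GenBank; all other data go to Zenodo."
hits = find_repositories(text)
```

Terms surfaced this way could be offered back to the author for confirmation, turning an unstructured answer into a structured, actionable one.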
To move on to some of the specific use cases, there are a number related to repos…
Institutions were keen to use the DMP process to connect researchers up with relevant services & support. They see the DMP as a good awareness raising and training tool.
Capacity planning is also a key use case for unis, specifically to ensure costs are written into proposals
And there’s a desire to link up across the various systems in use, as noted in the introductory slides. Some institutions like Purdue University are keen to be institutional pilots and map out all the information flows across different research systems, so we can have a picture of a whole university as an example.
Add IDs to the DMP, forward recognition = living, updatable document
Make the point that there are many kinds of IDs, can leverage them in various actionable ways
Can be about assigning DOIs to DMPs and/or using DOIs in DMPs. Goal to enable tracking of how commitments made to funders are followed through into actions to publish/deposit outputs.
Also note Stephanie’s presentation from PIDapalooza
We have done an ORCID integration so users can connect up their DMP profile. We hope to do more with this e.g. login with ORCID and use it to bring in other relevant info
Grant IDs can also be pulled into DMPs. For the EC, for example, we could use the OpenAIRE API to harvest these and provide a look-up when researchers are creating their plan at month 6. This also allows us to connect up the DMP with other outputs to assist in reporting
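As a sketch of what that look-up might involve, the snippet below builds a query URL for the public OpenAIRE project-search endpoint. The parameter names follow the OpenAIRE search API as we understand it, but treat them as assumptions; a real integration would fetch the URL and parse the XML response to pre-fill project details in the plan:

```python
from urllib.parse import urlencode

# Public OpenAIRE project-search endpoint; parameter names assumed from
# the OpenAIRE search API documentation.
OPENAIRE_PROJECTS = "https://api.openaire.eu/search/projects"

def openaire_project_query(grant_id, funder="EC"):
    """Build an OpenAIRE project-search URL for a given grant ID."""
    return OPENAIRE_PROJECTS + "?" + urlencode(
        {"grantID": grant_id, "funder": funder}
    )

url = openaire_project_query("654321")
```

Matching projects returned by such a query could populate the funder, grant and project-title fields of a plan automatically, rather than asking researchers to retype them.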
In terms of evaluation and monitoring, funders are particularly interested in automated compliance checks, and also support for plan review. More closed questions or predefined options would help in assessment, as well as basic tools like evaluation rubrics and training to support project officers. The other aspect mentioned in terms of H2020 was an evaluation of the FAIRness of data and repositories, which is a nice handover to Ingrid who’s up next…
Our next steps are to develop some of the use cases and test these out in the DMPRoadmap platform. Everyone from the workshop wanted to be a pilot user so we have lots of volunteers. The only limiting factor will be our bandwidth to run them all!
Summary of the overall vision
Quick note about promoting greater openness and public sharing of DMPs:
DMPTool Public DMPs list
DMP Collection in RIO Journal
Depositing DMPs in Zenodo, Dataverse and other IRs
Also plan to continue this work at the Barcelona plenary session in April – join us there!