A Big Picture in Research Data Management

A personal view of the big picture in Research Data Management, given at the GFBio - de.NBI Summer School 2018, "Riding the Data Life Cycle!", Braunschweig Integrated Centre of Systems Biology (BRICS), 03 - 07 September 2018.

Carole Goble
The University of Manchester
Head of Node: ELIXIR-UK
Coordinator: FAIRDOM
Chair RDM User Group: University of Manchester
carole.goble@manchester.ac.uk
GFBio - de.NBI Summer School 2018 Riding the Data Life Cycle!
Braunschweig Integrated Centre of Systems Biology (BRICS)
03 - 07 September 2018
Open Science
Open Data
Reuse Science
Reproducible Science
Personally Productive Science
Governments spend a lot of public money on research.
Much (all?) of it uses data or generates data or both.
Vahan Simonyan, Center for Biologics Evaluation and Research, Food and Drug Administration, USA
Stodden, Seiler, Ma. An empirical analysis of journal policy effectiveness for computational reproducibility, PNAS March 13, 2018. 115 (11) 2584-2589; https://doi.org/10.1073/pnas.1708290115
Since 2011
sharing/publishing assets in public archives…
Data Models
*top three most popular
The evolution of standards and data management practices in systems biology (2015). Stanford et al, Molecular Systems Biology, 11(12):851

Endocrine.pptx Organs/Glands in Endocrine SystemEndocrine.pptx Organs/Glands in Endocrine System
Endocrine.pptx Organs/Glands in Endocrine System
 
Anti-Obesity Activity of Anthocyanins and Corresponding Introduction in Dieta...
Anti-Obesity Activity of Anthocyanins and Corresponding Introduction in Dieta...Anti-Obesity Activity of Anthocyanins and Corresponding Introduction in Dieta...
Anti-Obesity Activity of Anthocyanins and Corresponding Introduction in Dieta...
 

A Big Picture in Research Data Management

  • 1. A Big Picture in Research Data Management Carole Goble The University of Manchester Head of Node: ELIXIR-UK Coordinator: FAIRDOM Chair RDM User Group: University of Manchester carole.goble@manchester.ac.uk GFBio - de.NBI Summer School 2018 Riding the Data Life Cycle! Braunschweig Integrated Centre of Systems Biology (BRICS) 03 - 07 September 2018
  • 2. Open Science Open Data Reuse Science Reproducible Science Personally Productive Science
  • 3. Governments spend a lot of public money on research Much (all?) of it uses data or generates data or both.
  • 4. Vahan Simonyan, Center for Biologics Evaluation and Research Food and Drug Administration USA
  • 5. Stodden, Seiler, Ma. An empirical analysis of journal policy effectiveness for computational reproducibility, PNAS March 13, 2018. 115 (11) 2584-2589; https://doi.org/10.1073/pnas.1708290115 Since 2011
  • 6. sharing/publishing assets in public archives… Data Models *top three most popular The evolution of standards and data management practices in systems biology (2015). Stanford et al, Molecular Systems Biology, 11(12):851
  • 7. NIH Rigor and Reproducibility https://www.nih.gov/research-training/rigor-reproducibility Plenty of advice cos.io/top
  • 8. Plenty of Funder Data Policies http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
  • 9. Pontika et al, Fostering Open Science to Research using a Taxonomy and an eLearning Portal at iKnow: 15th International Conference on Knowledge Technologies and Data Driven Business, http://dx.doi.org/10.1145/2809563.2809571 Open Science Taxonomy
  • 10. https://wellcomeopenresearch.org/ Nature Scientific Data Data Publishing and Citation http://www.scholix.org/ https://datacite.org/ https://www.force11.org/datacitationprinciples https://www.nature.com/sdata/
  • 11. “The FAIR Guiding Principles for scientific data management and stewardship”, Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18 Principles Metadata Identifiers Access policies Standards Technical: Political Social Economic: A rallying cry ….
  • 14. Research Data Management Retain (or dispose) Review (replicate & validate) Reproduce (verify, compare) By the researcher and their collaborators By their peers, the public and competitors (include, combine)
  • 15. Fifty Shades of FAIR Workflows SOPs Containers, cloud services, common services Packaging platforms (Research Objects) Markup languages, reporting guidelines and checklists, ontologies, catalogues Sounds hard….Catalogues Search markup
  • 16. …. RDM Lifecycles Collection Sharing Stewardship Integration Primary & secondary data, models, SOPs Metadata Experimental context Integration with in house data infrastructures FAIR Organise & link assets Standardised, consistent reporting Reproducible publications Yellow pages Exchange among colleagues How and when to share and publish Get and give credit Retain and find beyond project Span across legacy, in house, external systems, community archives Integrate with tools, analysis platforms, in house data infrastructures Curation support Capacity building Metadata practices Policies and governance Knowing what to throw away
  • 17. …. Curation Lifecycles + RDM Lifecycles https://www.nrel.colostate.edu/the-data-lifecycle-part-1-data-management-for-open-access-5-questions-to-ask-about-your-data/
  • 19. Do Research Research Infrastructure Services Assemble Methods, Materials Experiment Observe Simulate Analyse Results Quality Assessment Track and Credit Disseminate Deposit & Licence Marketplace Services Publish Share Results Any research product Selected products Manage Results Science 2.0 Repositories: Time for a Change in Scholarly Communication Assante, Candela, Castelli, Manghi, Pagano, D-Lib 2015 Science 2.0 Repositories
  • 20. 101 Innovations in Scholarly Communication - the Changing Research Workflow, Bosman and Kramer, 2015, http://figshare.com/articles/101_Innovations_in_Scholarly_Communication_the_Changing_Research_Workflow/1286826 A RDM Ecosystem
  • 21. Team Science …….Of Individuals Collaborating and Competing Simultaneously Self-deposit, self-curating, variable stewardship skills The RDM Team… A RDM Egosystem
  • 22. FAIR RDM in the Team multi-partner, multi-disciplinary projects What methods are being used to determine enzyme activity? What SOP was used for this sample? Where is the validation data for this model? Is there any group generating kinetic data? Is this data available? Track versions of my model What's the relationship between the data and model? Which data belong to which publications?
  • 25. Project Managed Spaces: Organisation -> Sharing -> Dissemination Project Investigation Programme Self-controlled spaces managed spaces One entry point over external systems A Project Commons
  • 26. X = data, software, method, article I can access your X Your X is (re)usable by me and with my tools/data I get credit for using your X You can’t use my X Only access/use my X if I say so I don’t have resources and skills to make my X reusable and reproducible I must get credit if you use X Someone else will be paying for X stewardship and archiving. X will always be there & free for me. Maturing this view. FAIR RDM outside the Team
  • 27. “Getting it published, not getting it right” Matt Spitzer, COS, Jisc-CNI Leadership Conference 2018 Reuse Debt Annotate for strangers Organise Share Disseminate Data decreases Metadata increases Reach increases • Metadata quality and quantity • Identifier hygiene
  • 28. me ME my team close colleagues peers Access Spiral: Staged sharing organisation – collaboration - dissemination The number of assets reduces Reach of sharing increases The richness of metadata needed increases Burden of work increases
  • 29. Data Science Analytics Machine learning Discovery, New algorithms Data stewardship Standardisation, Harmonisation, Annotation and enrichment, Maintaining access, preserving Software stewardship Updates, versions, porting Prep & Processing Data wrangling & curation Instrument pipelines Simulation sweeps
  • 30. Personal Productivity reviewers want additional work statistician wants more runs analysis needs to be repeated post-doc leaves, student arrives new/revised datasets updated/new versions of algorithms/codes sample was contaminated better kit - longer simulations new partners, new projects Means educating PIs and Supervisors Personal Productivity Retention, reuse Publish driven Public Good Sharing & Reproducibility Access driven
  • 31. Favourite excuses … The results are embedded in a figure in the paper I don’t know where the data is You can have it but the metadata is so bad you will need me to interpret it You can have it but only if you put me on your paper Pseudo Sharing Data Flirting Data Hugging The Reward Norms of Science… more later You won’t credit me or cite my data but you’ll demand work from me and use it for your own research reputation… Don’t have the resources or skills You will ask me questions
  • 32. RDM Stakeholders data managers librarians, IT admin Global Enterprises Standards, International Research Infrastructures RDM
  • 33. Capitalising on investments Retaining results post-project Pooling, transfer, sharing results Public collections Skilling workforce Compliance audit/metrics Community productivity Reproducibility Productivity Doing science with collaborators Publishing & getting credit Access to resources, results, collections Retention of my results post student Repeatability - reviewer wants more Competitiveness, protecting assets Managing costs Compliance Stakeholder Accountability Values overlaps, mismatches? Stakeholder Agendas New publishable assets Business models Reproducibility
  • 34. Knowledge Exchange Report: http://www.knowledge-exchange.info/event/ke-approach-open-scholarship RDM Knowledge Exchange Public Good Private Good Institutional Facility Community Organisation’s Good National centres Publishers, Funders Policy makers, Government Public archives Shared Infrastructure Shared Data Centres Global National Researcher Personal Researchers Trainers Students PIs Lab books Group infrastructure Data managers Lab managers Libraries Institutional repositories
  • 35. republic of science* regulation of science *Merton’s four norms of scientific behaviour (1942)
  • 36. Publishing in Public Central Repository Repertoire Stanford et al, The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851 DOI 10.15252/msb.20156053
  • 37. The RDM Ecosystem • public collections & archives • data centres • journals • Institutional repositories • most researchers • labs & universities • my resources Stanford et al, The evolution of standards and data management practices in systems biology, Molecular Systems Biology (2015) 11: 851 DOI 10.15252/msb.20156053
  • 40. Global & National RDM Global “Moonshot” Projects NIH Data Commons Standards Organisations International Organisations
  • 42. Services & Activities Training Communities Policy Data, Tools, Compute, Interoperability Engage European International National Industry domains technologies techniques
  • 43. RDM select, support, and sustain public and national data resources support development of new ones CDRs DDs NDRs support and advocate for standards, their adoption and provide support services Identifiers.org run registries, discovery and analysis tools coordinate integration efforts BioTools support researchers for their data management: training, DMP, infrastructure, consultancy by nodes for nodes in their national settings Nodes
  • 44. 1k+ Databases 1k+ Standards 100+ Policies https://dsw.fairdata.solutions Data Stewardship Wizard Practice identifier hygiene A unique identifier for each record 800+ data collections 10 Rules for Identifiers 10 Rules for Selecting a BioOntology 200+ Ontologies https://www.ebi.ac.uk/ols https://doi.org/10.1371/journal.pbio.2001414 https://doi.org/10.1371/journal.pcbi.100743
  • 46. A trusted virtual environment to store, share & re-use research information. Reduce reinvention. Avoid duplication Simplify access. Support interdisciplinary re-use. Serve Europe's 1.7 million researchers (of all disciplines) and 70 million science and technology professionals Open Science Move, share and re-use data seamlessly • across global markets and borders • among institutions and research disciplines • trusted free flow of data • data infrastructure to store and manage data • high-speed connectivity to transport data • High Performance Computers to process data Realising the EOSC doi:10.2777/940154
  • 47. eucli d Pan-European e-Infrastructures Research Infrastructures HPC Centres of Excellence National Regional e-Infrastructures Policy and Best Practice National Local Research Infrastructures Integration Projects Thematic e-Infrastructures [Per Oster]
  • 48. Data and tools from contributors National Nodes, Site monitoring Community oriented Integration [Based on Massimo Cocco, ENVRI] e-Infrastructures Cloud Research Infrastructures Commons
  • 49. A Research Commons? collectively created, owned and shared, with governance “… a cloud-based platform where investigators can store, share, access, and interact with digital objects (data, software, etc.) generated from …. research. By connecting the digital objects and making them accessible, the Data Commons is intended to allow novel scientific research that was not possible before, including hypothesis generation, discovery, and validation.” https://commonfund.nih.gov/commons Pooled Resources Federation Access NIH Data Commons
  • 50. • Overcoming fragmentation – Across scattered resources, platforms, people • Improving flow of information – Coordination, collaboration • Cumulative, dynamic [original figure: Josh Sommer] Cumulative A Commons Goble, De Roure, Bechhofer, Accelerating Knowledge Turns, I3CK, 2013, isbn: 978-3-642-37186-8 http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation
  • 51. multi-object multi-repositories Experimental context All together Type specific archives Fragmented silos Models Presentations events Articles Workflows Samples metadata Data Standard Operating Procedures version, tracking provenance parameters citation
  • 52. De-contextualised Static, Fragmented Lost Semantic linking Contextualised Active, Unified Semantic linking Buried in a PDF figure Reading and Writing Scattered…. Fragmented Dissemination
  • 53. 3 Studies Model analysis, construction, validation 24 Assays/Analysis Simulations, characterisations 16 19 13 2 1 Structured organisation Retain context in one place Deposit in the fragmented resources [Penkler, Snoep]
  • 54. FAIRDOMHub : A Federated “Virtual” Data Commons based on aggregation http://fairdomhub.org External Databases In House Stores Secure Stores Modelling Resources Distributed Commons, Integrated View Analytical Resources In progress
  • 56. Knowledge Exchange Report: http://www.knowledge-exchange.info/event/ke-approach-open-scholarship project based asset management and collaboration (inter)national archives and infrastructures Automated deposition & harvesting institutional repositories and infrastructures Federation Standardised hygienic identifiers Standardised metadata exchange Standardised protocol/APIs
  • 57. Data-Literature Interoperability evolving lightweight set of guidelines http://www.scholix.org/
  • 58. Standardised metadata mark-up Metadata published & harvested without APIs or special feeds Commodity Off the Shelf tools App eco-system schema.org tailored to the Biosciences for FAIR simple structured metadata markup on web pages & sitemaps MarRef Marine Metagenomics Database BioSamples Deposition Database Metadata Federation & SEARCH of course!
  • 59. The First and Last Mile “ramps” onto the Research Data Infrastructures FAIR data at source – data deposition, validation and upload pipelines into public repositories FAIR access from my tools Bench Benefit The ‘last mile’ challenge for European research e-infrastructures https://doi.org/10.3897/rio.2.e9933 EOSC Harvesting Templates Automation Tracking pipelines Notebooks Spreadsheet wrangling Data2Paper Data Tracking Sheets
  • 60. https://ncip.nci.nih.gov/blog/face-new-tragedy-commons-remedy-better-metadata/ “Creating good metadata takes considerable work …. when investigators act in their own self-interest, taking short cuts to generate metadata as quickly as possible, we should expect that the overall utility of the resource will decline. … a need for easy-to-use solutions that are generic to provide guidance over the entire life cycle of metadata — streamlining metadata creation, discovery, and access, as well as supporting metadata publication to third-party repositories” Mark Musen Stanford The First Mile: Metadata at Source Reduce complexity
  • 61. Specialist databases Local Biochem4j ICE Global Brenda, wikipathways, Biomodels ICE Public Deposition Databases Public Catalogues Tracking in Specialist Systems Institutional Catalogue & Repository Scientists workflow drives the RDM workflow, not the other way round…… “metadata transaction tools”
  • 63. Research Infrastructure Services Assemble Methods, Materials Experiment Observe Simulate Analyse Results Quality Assessment Track and Credit Disseminate Deposit & Licence Marketplace Services Share Results Manage Results Building a FAIR Research Commons Science 2.0 Repositories: Time for a Change in Scholarly Communication Assante, Candela, Castelli, Manghi, Pagano DOI: 10.1045/january2015-assante Mesirov, J. Accessible Reproducible Research Science 327(5964), 415-416 (2010) Born FAIR Elsewhere on-date Within during
  • 64. Research Infrastructure Services Assemble Methods, Materials Experiment Observe Simulate Analyse Results Quality Assessment Track and Credit Disseminate Deposit & Licence Marketplace Services Share Results Manage Results Releasing Portable Reproducible Objects Science 2.0 Repositories: Time for a Change in Scholarly Communication Assante, Candela, Castelli, Manghi, Pagano DOI: 10.1045/january2015-assante Mesirov, J. Accessible Reproducible Research Science 327(5964), 415-416 (2010) Supporting researchers to make & exchange FAIR content as they go… Credit for all products Value quality Data + the Methods
  • 65. Packaging: data + methods + models Scharm M, Wendland F, Peters M, Wolfien M, Theile T, Waltemath D SEMS, University of Rostock zip-like file with a manifest & metadata - Bundling files - Keeping provenance - Exchanging data - Shipping results Bergmann, F.T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014). COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC Bioinformatics, 15(1), 1. Combine Archive https://sems.unirostock.de/projects/combinearchive/
  • 66. The Cinderella of RDM: Standard Operating Procedures Record your processing steps
  • 67. Research Object Bundling Provenance Dependencies Versions Checklists Variance Portability Transparent Processes
  • 68. Precision medicine NGS pipelines Alterovitz, Dean, Goble, Crusoe, Soiland-Reyes et al Enabling Precision Medicine via standard communication of NGS provenance, analysis, and results, biorxiv.org, 2017, https://doi.org/10.1101/191783 Assemble, share, and analyze large and complex multi-element datasets distributed across multiple locations, referenced because too big Secure large scale moving of patient data. Chard et al I'll take that to go: Big data bags and minimal identifiers for exchange of large, complex datasets, https://doi.org/10.1109/BigData.2016.7840618
  • 69. FAIR Exchange of Research Goods Governance Stewardship Credit Tracking Lifecycles Fixivity… Arxiv, my Lab myExperiment GitHub, Web Service myWebSite bioModels.org, openModeller PubMed Spreadsheet in figshare ArrayExpress, BioSamples, PRIDE, GBIF, my Lab, institutional repository Overlaying the Research Commons Ecosystem
  • 70. Tracking, credit mining, comparison, auto-metadata, blockchain, boundary objects…. 1 3 2 A FAIR Knowledge Web of Research Objects Map across metadata Threaded publications Navigate, Pivot-Focus, Cite Self-describing
  • 72. Releasing Research: “within during” Analogous to software products & practices rather than articles An “evolving manuscript” would begin with a pre-publication, pre-peer review “beta 0.9” version of an article, followed by the approved published article itself, [ … ] “version 1.0”. Subsequently, scientists would update this paper with details of further work as the area of research develops. Versions 2.0 and 3.0 might allow for the “accretion of confirmation [and] reputation”. Ottoline Leyser […] assessment criteria in science revolve around the individual. “People have stopped thinking about the scientific enterprise”. http://www.timeshighereducation.co.uk/news/evolving-manuscripts-the-future-of-scientific-communication/2020200.article Demands different ideas of credit and citation
  • 73. Living Entry Published Snapshot Entry FAIRDOM Commons Releasing…. G. Penkler, F. Du Toit, W. Adams, M. Rautenbach, D. C. Palm, D. D. Van Niekerk, & J. L. Snoep. (2014). Glucose metabolism in Plasmodium falciparum trophozoites. FAIRDOMHub. http://doi.org/10.15490/seek.1.investigation.56
  • 74. Research Infrastructure Services Assemble Methods, Materials Experiment Observe Simulate Analyse Results Quality Assessment Track and Credit Disseminate Deposit & Licence Marketplace Services Share Results Manage Results Releasing Portable Reproducible Objects Science 2.0 Repositories: Time for a Change in Scholarly Communication Assante, Candela, Castelli, Manghi, Pagano DOI: 10.1045/january2015-assante Mesirov, J. Accessible Reproducible Research Science 327(5964), 415-416 (2010) Supporting researchers to make & exchange FAIR content as they go… Credit for all products Value quality Data + the Methods
  • 75. FAIR Play: Walled Gardens Open science applies to you but not me … not available = not citable Jurgen Hannstra Vrije Universiteit, Amsterdam Using FAIRDOM my own lab colleagues saw what I was doing and called to collaborate! • Licenses • Negotiated access • Embargos • Permission controls • Staged sharing • Private spaces • enclave sharing • consortia pressures • within project mistrusts • patterns (models vs data) • hoarding & flirting • personal dowries • ex-member divorces • asymmetrical reciprocity • credit and citation • “on date” not “during” publishing
  • 76. FAIR Play: RDM Stewardship Value Systems • of assets, of reproducibility, of metadata • public vs personal good • economics of infrastructure • priorities • stewards and stewardship • credit & reward Sweatshops • competing • burden - time, skills • short term, shortcuts • untrained • leadership sets the tone The reward norms of science need to change Everyone knows this. No-one knows how to fix it. All research products and all scholarly labour are equally valued (except by institutional promotion boards, funding panels, and review committees)
  • 77. Data Journals Data Citation Data Policies: Open Data by Default Credit & Citation Infrastructure (altmetrics based) Data Stewardship Careers
  • 78. Credit – giving and taking CRediT Stop conflating credit with authorship Getting people to cite data Data Citation Metadata Landing Pages Persistent Identifiers Data citation mining https://project-thor.eu/ https://casrai.org/credit/ https://www.nature.com/articles/sdata201539 Making Data Count Linking Data to Literature https://www.project-freya.eu/
  • 79. Data Stewardship Career Recognition 500,000 needed in Europe Stewards – skilling and rewarding
  • 82. Stable & Sustained Infrastructure & Support FAIR ≠ FREE Countless expectations to do RDM Much less in how to sustain the archives, infrastructure and the skills needed “we want FAIR data but we will only support research” Complexity of funding federated commons with project-based national funds Funding models need an update!
  • 84. Why FAIR isn’t FREE…..
  • 85. data managers librarians Global Enterprises Standards, International Research Infrastructures FAIR Research Commons
  • 86. A Bigger RDM Picture Fragmentation Federation Ecosystem Embed in working practice Born FAIR Ramps First & Last Mile Egosystem Stakeholders Research Objects Stewardship Professionalisation Cultural norms Interoperability FAIR is not FREE Releasing Credit, reward
  • 87. What can you do? Five steps to better data better research Get expert help and give stewards credit Train your Team incl. your PI Publish your Data and credit others Develop a DMP and resource it Annotate for strangers Create analysis-friendly data Record your processing steps Use a unique identifier for each record Use standards Save and backup raw data Submit to a repository. Get a DOI Try to use platforms and tools that work together
  • 89. Acknowledgements • David De Roure • Tim Clark • Sean Bechhofer • Robert Stevens • Christine Borgman • Victoria Stodden • Marco Roos • Jose Enrique Ruiz del Mazo • Oscar Corcho • Ian Cottam • Steve Pettifer • Magnus Rattray • Chris Evelo • Katy Wolstencroft • Robin Williams • Pinar Alper • C. Titus Brown • Greg Wilson • Kristian Garza • Matthew Dovey • Nick Juty • Helen Parkinson • Juliana Freire • Jill Mesirov • Simon Cockell • Paolo Missier • Paul Watson • Gerhard Klimeck • Matthias Obst • Jun Zhao • Pinar Alper • Daniel Garijo • Yolanda Gil • James Taylor • Alex Pico • Sean Eddy • Cameron Neylon • Barend Mons • Kristina Hettne • Stian Soiland-Reyes • Rebecca Lawrence • Michael Crusoe • Raphael Jimenez • Alasdair Gray
  • 90. Jon Olav Vik, Norwegian University of Life Science Maksim Zakhartsev University Hohenheim, Stuttgart, Germany Alexey Kolodkin Siberian Branch Russian Academy of Sciences Tomasz Zieliński, SynthSys Centre University Edinburgh, UK Martin Peters, Martin Scharm Systems Biology Bioinformatics University of Rostock, Germany Hadas Leonov
  • 91. EXTRA
  • 92. From: EOSC Stakeholder Forum, Brussels 28-29 November 2017 Soap-box session: Intermediaries, Research communities & Libraries, Valentino Cavalli
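
Slide 65 describes packaging as a "zip-like file with a manifest & metadata" used for bundling files and keeping provenance, and slide 87 advises to save raw data and record processing steps. A minimal sketch of that idea in Python follows. It is an illustration only, not the COMBINE archive / OMEX format or a Research Object implementation: the manifest layout, the file name `manifest.json`, and the function names `build_bundle` and `verify_bundle` are all assumptions made for this example.

```python
import hashlib
import json
import zipfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_bundle(files, bundle_path, description):
    """Zip the given files together with a JSON manifest listing each one.

    Hypothetical layout: files are stored flat under their base names,
    so two inputs with the same file name would collide.
    """
    files = [Path(f) for f in files]
    manifest = {
        "description": description,
        "entries": [
            {"name": f.name, "bytes": f.stat().st_size, "sha256": sha256_of(f)}
            for f in files
        ],
    }
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for f in files:
            bundle.write(f, arcname=f.name)
        bundle.writestr("manifest.json", json.dumps(manifest, indent=2))
    return manifest


def verify_bundle(bundle_path):
    """Re-hash every entry in the bundle and compare it with the manifest."""
    with zipfile.ZipFile(bundle_path) as bundle:
        manifest = json.loads(bundle.read("manifest.json"))
        return all(
            hashlib.sha256(bundle.read(entry["name"])).hexdigest() == entry["sha256"]
            for entry in manifest["entries"]
        )
```

A call such as `build_bundle(["raw_data.csv", "sop.txt"], "bundle.zip", "demo")` (file names hypothetical) would write the archive, and `verify_bundle("bundle.zip")` would later confirm that the stored files still match their recorded checksums. Real packaging formats add richer metadata, such as identifiers, provenance and file types, on top of this basic pattern.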