Research Objects:
why, what and how
Professor Carole Goble CBE FREng FBCS
The University of Manchester, UK
The Software Sustainability Institute, UK
carole.goble@manchester.ac.uk
researchobject.org
Metadata and Semantic Research Conference 2015, 9-11 Sept 2015, Manchester, UK
Prologue
e-Lab Collabs.
& Shared Asset
Repositories
Knowledge,
Metadata, Linked
Data, Ontologies
Software
Engineering for
Scientists
Computational
Workflow Systems
Reproducibility
Micro
Publications
Open Science
Research
Objects
Linked Data for
Science
Scholarly
Comms
Prologue
Biodiversity
Systems Biology
Synthetic Biology
Astronomy
Helio
Physics
Genomics
Public Health
Epidemiology
Digital
Preservation
Social
Science
Pharmacology
Knowledge Turning, Info Flow
Barriers to Cure
• Access to scientific
resources
• Coordination and
Collaboration
• Flow of Information
http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation
[Pettifer, Attwood]
http://getutopia.com
Virtual Witnessing*
Scientific publications:
• announce a result
• convince readers the result is
correct
“papers in experimental [and
computational science] should
describe the results and provide
a clear enough protocol
[algorithm] to allow successful
repetition and extension”
Jill Mesirov, Broad Institute, 2010**
**Accessible Reproducible Research, Science 22 January 2010, Vol. 327 no. 5964 pp. 415-416, DOI: 10.1126/science.1179653
*Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life (1985) Shapin and Schaffer.
Bramhall et al QUALITY OF METHODS REPORTING IN ANIMAL MODELS OF
COLITIS, Inflammatory Bowel Diseases, 2015
“Only one of the 58 papers reported all essential
criteria on our checklist. Animal age, gender, housing
conditions and mortality/morbidity were all poorly
reported…..”
50 papers randomly chosen from 378 manuscripts in
2011 that use Burrows-Wheeler Aligner for mapping
Illumina reads
31: no s/w version,
parameters, exact version of
genomic reference sequence
26: no access to
primary data sets
Nekrutenko & Taylor, Next-generation sequencing data interpretation:
enhancing reproducibility and accessibility, Nature Reviews Genetics 13 (2012)
“I can’t immediately reproduce the research in my own laboratory.
It took an estimated 280 hours for an average user to approximately
reproduce the paper.”
Prof Phil Bourne
Associate Director, NIH Big Data 2 Knowledge Program
“An article about
computational science in a
scientific publication is not
the scholarship itself, it is
merely advertising of the
scholarship. The actual
scholarship is the complete
software development
environment, [the complete
data] and the complete set
of instructions which
generated the figures.”
David Donoho, “Wavelab and
Reproducible Research,” 1995
From Manuscripts to “Research Objects”
Multi-various, citable research products/assets
Pre-packaged Docker images containing a
bioinformatics tool and
standardised interface through which data
and parameters are passed.
http://bioboxes.org
FAIR Research, crossing silos
From Manuscripts to “Research Objects”
Datasets, Data collections
Standard operating procedures
Software, algorithms
Configurations,
Tools and apps, services
Codes, code libraries
Workflows, scripts
System software
Infrastructure
Compilers, hardware
Fragmentation
FAIR RO Distributed Commons
NIH BD2K, EU FAIRPorts…. Pooled Resources
NIH BD2K Commons
and Research Objects
https://datascience.nih.gov/commons
Why Research Objects?
• Computational Workflows / Scripts
– Multi-step, nested.
– Data, executable codes (remote and local),
libraries
– Preservation, Repair
– Reproducibility
• Systems Biology
– Models, data (construction, validation,
predicted), SOPs, samples, articles
– Structured Investigations, Studies, Assays
– Exchange
– Reproducibility
Commons
myexperiment.org
fair-dom.org
"Mapping present and future predicted distribution patterns for a
meso-grazer guild in the Baltic Sea" by Sonja Leidenberger et al
Workflow Commons
Instruments, Materials, Method
Data Scopes
Input Data
Software
Output Data
Config
Parameters
Methods
techniques, algorithms,
spec. of the steps
Materials
datasets, parameters,
algorithm seeds
Experiment
Instruments
codes, services, scripts,
underlying libraries
Laboratory
sw and hw infrastructure,
systems software,
integrative platforms
Setup
Drummond, Replicability is not Reproducibility: Nor is it Good Science, online
Peng, Reproducible Research in Computational Science Science 2 Dec 2011: 1226-1227.
Instruments, Materials, Method
Read. Run. Remake
Science changes,
experiments & results vary,
So do labs.
Instruments break,
labs decay.
Zhao, et al . Why workflows break - Understanding and combating decay in
Taverna workflows, 8th Intl Conf e-Science 2012
http://atyourservice.blogs.xerox.com/files/2011/09/cloning-results-may-vary.jpg
Reproducibility: working. reporting
submit article
and move on…
publish article
Research
Environment
Publication
Environment
Peer
Review
FAIR Reproducibility
Find, Access, Interoperate, Reuse
https://doi.org/10.15490/seek.1.investigation.56
FAIRDOM Metadata framework
link studies, link assets, map content to.
Common
elements and
relationships
between things
produced and
used in
experiments.
Common
elements
Specific
elements for
specific data
types.
Just Enough
Results Model
http://seek4science.org/JERMOntology
http://isatab.sourceforge.net/format.html
Penkler et al (2015) FEBSJ 282:1481-1511
https://dx.doi.org/10.1111/febs.13237
Consumers
Producers
Project
Repositories
harvesting
link
Standards
organise
validate
Native Commons
Repositories
Why Research Objects?
Compound, nested, scattered, yet interconnected
COMMONS
Why Research Objects?
Preserved, portable research products. Snapshots.
inter-platform exchange, reproducibility
Commons
New
Discovery
Cross-Institutional e-Lab fragmentation
parts scattered across subject specific/general resources
101 Innovations in Scholarly Communication - the Changing Research Workflow, Bosman and Kramer, 2015,
http://figshare.com/articles/101_Innovations_in_Scholarly_Communication_the_Changing_Research_Workflow/1286826
Why Research Objects?
Active research products, snapshots
• Fork.
• Merge.
• Version.
• Cite
• Snapshot.
• Live.
[Martin Scharm]
Haus et al, BMC Systems Biology, 2011, 5:10
Solvent production by Clostridium acetobutylicum
F1000Research Living Figures
versioned articles, in-article data manipulation
R Lawrence Force2015, Vision Award Runner Up
http://f1000.com/posters/browse/summary/1097482
Simply data + code
Can change the definition of
a figure, and ultimately the
journal article
Colomb J and Brembs B.
Sub-strains of Drosophila Canton-S differ
markedly in their locomotor behavior [v1;
ref status: indexed, http://f1000r.es/3is]
F1000Research 2014, 3:176
Other labs can replicate the study, or
contribute their data to a meta-
analysis or disease model - figure
automatically updates.
Data updates time-stamped.
New conclusions added via versions.
Publish, Release (like Software)
An “evolving manuscript” would begin with a
pre-publication, pre-peer review “beta 0.9”
version of an article, followed by the approved
published article itself, [ … ] “version 1.0”.
Subsequently, scientists would update this
paper with details of further work as the area
of research develops. Versions 2.0 and 3.0
might allow for the “accretion of confirmation
[and] reputation”.
Ottoline Leyser […] assessment criteria in
science revolve around the individual. “People
have stopped thinking about the scientific
enterprise”.
http://www.timeshighereducation.co.uk/news/evolving-manuscripts-the-future-of-scientific-communication/2020200.article
Jennifer Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012
Software-like Release paradigm
• Agile
development
methods
• Free Open
Source
Software
methods
https://tctechcrunch2011.files.wordpress.com/2011/05/tcdisrupt_tc-9.jpg
Knowledge
Turning
interpret
Commons
FAIR
Research
Products
Reproducibility
Interpretation
Comparison
Preservation
Portability
Release
Active
Research
Research Object
means
ends
drivers
Why Summary Framework
Goble, De Roure, Bechhofer, Accelerating Knowledge Turns, DOI: 10.1007/978-3-642-37186-8_1
Multi-various products, platforms, resources.
First class citizens - id, manage, credit, track, profile, focus
A Framework to Bundle, Port and Link (scattered) resources, related
experiments. Metadata Objects that carry Research Context. Units of exchange.
Bechhofer, Why linked data is not enough for scientists,
DOI: 10.1016/j.future.2011.08.004
Metadata Objects
Evolving
multi-typed, stewarded, sited, authored
span research, researchers, platforms, time
Contributions.
Content.
closed <-> open
local <-> alien
embed <-> refer
Stewardship. Citation.
Bigger on the inside than the
outside. Content may be
logically or physically inside
TARDIS: Time and Relative
Dimension in Space
Scholarship
https://meditationsfromzion.files.wordpress.com/2013/05/tardis.jpg
What and How Framework
Manifest
Core model
using
standards
Annotation
profiles
progressive
extensions
Implementation
Profiles
using legacy
& commodity
platforms
Policies
Tools
Lifecycle
Stewardship
Training
Principles & Conventions
API specification
Metadata formats
Technology Independent.
The least possible.
The simplest feasible.
Low tech.
Graceful degradation.
The Research Object Desiderata
Manifests and Containers
Container
Packaging:
Zip files, Docker images, BagIt, …
Catalogues & Commons Platforms:
FAIRDOM SEEK, Farr Commons CKAN,
STELAR eLab, myExperiment
Manifest
Metadata
Describes the aggregated resources, their
annotations and their provenance
Manifest
Manifest Metadata
Manifest Construction
• Identification – id, title, creator, status….
• Aggregates – list of ids/links to resources
• Annotations – list of annotations about resources
Manifest
Manifest Description
• Checklists – what should be there
• Provenance – where it came from
• Versioning – its evolution
• Dependencies – what else is needed
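The manifest elements listed above (identification, aggregates, annotations) can be sketched in a few lines of Python. This is an illustration under assumptions, not the Research Object model itself: the key names (`id`, `createdBy`, `aggregates`, `annotations`, `about`, `content`) only loosely echo the RO Bundle manifest and have not been checked against the specification, and the helper `build_manifest` and all file names are hypothetical.

```python
import json


def build_manifest(ro_id, title, creator, resources, annotations):
    """Assemble identification, aggregates and annotations into one dict."""
    aggregated = {res["uri"] for res in resources}
    # Every annotation must describe a resource that the RO actually aggregates.
    for ann in annotations:
        if ann["about"] not in aggregated:
            raise ValueError(f"annotation about unaggregated resource: {ann['about']}")
    return {
        "id": ro_id,                      # identification
        "title": title,
        "createdBy": {"name": creator},
        "aggregates": list(resources),    # ids/links to resources
        "annotations": list(annotations), # annotations about those resources
    }


# Hypothetical content, for illustration only.
manifest = build_manifest(
    ro_id="/",
    title="Example workflow run",
    creator="A. Researcher",
    resources=[{"uri": "inputA.txt"}, {"uri": "workflowrun.prov.ttl"}],
    annotations=[{"about": "inputA.txt", "content": "annotations/inputA-notes.ttl"}],
)
print(json.dumps(manifest, indent=2))
```

Checklists, provenance, versioning and dependencies would then be further annotations attached to the same aggregated resources, rather than changes to this core structure.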
Manifest
Manifest Construction
Unique identifiers as
names for things.
doi, epic, orcid, purl, RII,
Identifiers.org
Mechanism of
aggregation to group
things together.
OAI-ORE
Metadata about those
things & how they relate
to each other.
W3C OADM
http://w3id.org/ro/
FAIR Manifest Descriptions: Types of RO
Progressive Annotation Profiles
Checklist
Provenance
Versioning
Dependencies
http://www.cnri.reston.va.us/papers/OverviewDigitalObjectArchitecture.pdf
NISO-JATS
Dublin Core
EFO JERM
SBML wfdesc
Checklists aka Reporting Guidelines
Consistent Reporting, Standardised Cataloguing, Validation
Gamble, Goble, Klyne, Zhao
MIM: A Minimum Information Model vocabulary and
framework for Scientific Linked Data,
IEEE 8th Intl Conf on eScience, 2012
MeanWhealDiameter reports:
must include values for the
properties: SubjectId,
SptSolution, Date, FollowUp
should include values for the
properties: VariableLabel
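The must/should rule for MeanWhealDiameter reports can be sketched as a small check. This is plain Python rather than the MIM vocabulary (which the cited paper defines for Linked Data); only the property names come from the slide, while the function `check_report` and the sample report are hypothetical.

```python
# Property names taken from the slide; everything else is illustrative.
MUST = ("SubjectId", "SptSolution", "Date", "FollowUp")
SHOULD = ("VariableLabel",)


def check_report(report):
    """Return (missing MUST properties, missing SHOULD properties)."""
    def missing(names):
        # A property counts as missing if it is absent or has an empty value.
        return [name for name in names if report.get(name) in (None, "")]

    return missing(MUST), missing(SHOULD)


# Hypothetical report with one required and one recommended value absent.
report = {"SubjectId": "subject-001", "SptSolution": "example", "Date": "2015-09-10"}
errors, warnings = check_report(report)
print("fails checklist:", errors)
print("incomplete but acceptable:", warnings)
```

Separating the two lists lets a catalogue reject a report that misses a "must" property while still accepting, and flagging, one that only misses a "should" property.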
Implementation Profiles
Research Object Bundle Specification
Manifest
https://w3id.org/bundle/ doi:10.5281/zenodo.10440
RO Unzip
• Reproducibility
• Versioning
• Systematic and
extensible metadata
collection
• Cross platform
exchange
• Publishing
Living Snapshot
Sys and Syn Bio Experiments
management and publishing
Examples
Sys & Syn Biology
Community Standards
Bergmann, Rodriguez, Le Novère.
COMBINE archive specification.
<http://identifiers.org/combine.specifications/omex.version-1> (2014)
Bergmann et al., COMBINE archive and OMEX
format: one file to share all information to
reproduce a modeling project, BMC
Bioinformatics 2014, 15:369
Combine with RO.
Standardised metadata & API
http://co.mbine.org/documents/archive
https://github.com/stain/ro-combine-archive doi:10.5281/zenodo.10439
Martin Scharm
Universität Rostock
ATLAS Collider
Data Analytics
Portable, lightweight
application runtime
and packaging tool.
Image
ATLAS and CMS detector data
Charles Vardeman, Da Huo
University of Notre Dame
All data and files
of the execution
+ Instructions
convert
bundle
manifest
Relate files
and layers
Add provenance
and annotations
Link in other
content
Exchange
Reproducibility
Same data
Same code
Same run time
environment
Systematic and
extensible metadata
collection
Computational Workflow Runs
workflowrun.prov.ttl
(RDF)
outputA.txt
outputC.jpg
outputB/
intermediates/
1.txt
2.txt
3.txt
de/def2e58b-50e2-4949-9980-fd310166621a.txt
inputA.txt
workflow attribution
execution
environment
Aggregating in Research Object
ZIP folder structure (RO Bundle)
mimetype
application/vnd.wf4ever.robundle+zip
.ro/manifest.json
URI
references
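The ZIP layout above can be sketched with Python's standard `zipfile` module. The media type string and the `.ro/manifest.json` path come from the slide; writing the `mimetype` entry first and uncompressed is an assumption about the bundle convention that should be checked against the RO Bundle specification, and the manifest keys, file names and the helper `write_bundle` are hypothetical.

```python
import io
import json
import zipfile

MIMETYPE = "application/vnd.wf4ever.robundle+zip"


def write_bundle(target, manifest, files):
    """Write a ZIP with a mimetype entry, .ro/manifest.json and payload files."""
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Assumption: the mimetype entry goes first and is stored uncompressed.
        zf.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        zf.writestr(".ro/manifest.json", json.dumps(manifest, indent=2))
        for name, data in files.items():
            zf.writestr(name, data)


# Hypothetical payload and manifest, for illustration only.
files = {"inputA.txt": "example input\n", "outputA.txt": "example output\n"}
manifest = {"id": "/", "aggregates": [{"uri": name} for name in files]}

buffer = io.BytesIO()  # in-memory target; a file path works the same way
write_bundle(buffer, manifest, files)

with zipfile.ZipFile(buffer) as zf:
    names = zf.namelist()
    stored_manifest = json.loads(zf.read(".ro/manifest.json"))
print(names)
```

Resources that live elsewhere would appear in the manifest as URI references rather than as files inside the archive.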
Exchange
Reproducibility
Same data
Same code
Systematic and
extensible metadata
collection
Workflow
Annotation Profile
Wf4Ever
Project
STELAR Asthma
Research e-Lab
STELAR e-Lab
Requests for data
Data Exports
Comments, questions
ALSPAC
MAAS
SEATON
Ashford
On-going data
collection
STELAR Researchers
Isle of
Wight
Data Collection
Methods and Results
STELAR Team
Farr Institute@Manchester
Farr Institute Commons
catalogues over safe havens
Exchange
Systematic and
extensible metadata
collection
NIH BD2K Commons
and Research Objects
Metadata Profiles
RO Model API
Community IDs
RO Model Manifest Profile
Implementation Profiles
https://datascience.nih.gov/commons
Many
Challenges
Many outstanding issues…
Social & Cultural Technical
Tragedy of the Commons
https://doctorwhothing.files.wordpress.com/2014/01/doctor-who-fan-girl-group.jpg
me
ME
my team
close
colleagues
peers
Personal productivity
Retention & Reuse
Publish driven
Public Good
Sharing & Reproducibility
Access driven
[Apologies to Resnick and Malone]
FAIR Reward. Reducing Pain.
Cost vs Benefit.
RO Ramps. Born RO.
Commodity Tooling, Libraries, Lightweight
Making and Auto-making
Manifest Descriptions
Making
Containers
Literate Programming,
electronic lab notebooks
Rendering &
Using Manifests
FAIR Citation, credit, tracking
• Citation
– Resolution and semantics
• Tamper-proof currency
– Blockchain, Ethereum
• RO trajectories
– Data trajectories [Missier]
– Provenance propagation
• Credit trajectories
– Micro-credit tracking
• Social-political acceptance
– All research products valued
– FAIR publishing effort recognised
• Defend it (snapshot)
• Locate it (most recent)
• Reuse it (a version, a component)
• Credit it (contributory authorship)
• Cross link it (connections)
Knowledge Turning with ROs
Simple approach, towards transparent FAIR principles
https://d2t1xqejof9utc.cloudfront.net/screenshots/pics/1ddf584eb4cf6b1283baf9aa6d380cff/original.jpg
Inspired by Bob Harrison
• Incremental shift for
infrastructure providers.
• Moderate shift for policy
makers and stewards.
• Paradigm shift for
researchers, their
institutions and
publishers.
Knowledge Turning with ROs
All the members of the Wf4Ever team
Colleagues in Manchester’s Information
Management Group
http://www.researchobject.org
http://www.wf4ever-project.org
http://www.fair-dom.org
http://seek4science.org
http://rightfield.org.uk
http://www.software.ac.uk
http://www.datafairport.org
Alan Williams
Jo McEntyre
Norman Morrison
Stian Soiland-Reyes
Paul Groth
Tim Clark
Juliana Freire
Alejandra Gonzalez-Beltran
Philippe Rocca-Serra
Ian Cottam
Susanna Sansone
Kristian Garza
Barend Mons
Sean Bechhofer
Philip Bourne
Matthew Gamble
Raul Palma
Jun Zhao
Neil Chue Hong
Josh Sommer
Matthias Obst
Jacky Snoep
David Gavaghan
Rebecca Lawrence
Stuart Owen
Finn Bacall

More Related Content

What's hot

Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Carole Goble
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
Carole Goble
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!
Carole Goble
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
Carole Goble
 

What's hot (20)

Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
 
FAIRy Stories
FAIRy StoriesFAIRy Stories
FAIRy Stories
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
 
Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017Being Reproducible: SSBSS Summer School 2017
Being Reproducible: SSBSS Summer School 2017
 
Reproducible and citable data and models: an introduction.
Reproducible and citable data and models: an introduction.Reproducible and citable data and models: an introduction.
Reproducible and citable data and models: an introduction.
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
 
Advances in Scientific Workflow Environments
Advances in Scientific Workflow EnvironmentsAdvances in Scientific Workflow Environments
Advances in Scientific Workflow Environments
 
Introduction to FAIRDOM
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOM
 
FAIR data and model management for systems biology.
FAIR data and model management for systems biology.FAIR data and model management for systems biology.
FAIR data and model management for systems biology.
 
ROHub
ROHubROHub
ROHub
 
Reproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpReproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects help
 
Reproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsReproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trends
 
Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...
 
The FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems BiologyThe FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems Biology
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
 
Making your data good enough for sharing.
Making your data good enough for sharing.Making your data good enough for sharing.
Making your data good enough for sharing.
 

Similar to Mtsr2015 goble-keynote

Capturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven ScienceCapturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
dgarijo
 
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
William Gunn
 

Similar to Mtsr2015 goble-keynote (20)

Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015Keynote speech - Carole Goble - Jisc Digital Festival 2015
Keynote speech - Carole Goble - Jisc Digital Festival 2015
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research Objects
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Research Objects for FAIRer Science
Research Objects for FAIRer Science Research Objects for FAIRer Science
Research Objects for FAIRer Science
 
The beauty of workflows and models
The beauty of workflows and modelsThe beauty of workflows and models
The beauty of workflows and models
 
Peer Review and Science2.0
Peer Review and Science2.0Peer Review and Science2.0
Peer Review and Science2.0
 
RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015RDA Scholarly Infrastructure 2015
RDA Scholarly Infrastructure 2015
 
Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014Results may vary: Collaborations Workshop, Oxford 2014
Results may vary: Collaborations Workshop, Oxford 2014
 
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven ScienceCapturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
 
Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011Acs denver dirks potenzone 30 aug2011
Acs denver dirks potenzone 30 aug2011
 
The Future of Research (Science and Technology)
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
Scott Edmunds ISMB talk on Big Data Publishing
Scott Edmunds ISMB talk on Big Data PublishingScott Edmunds ISMB talk on Big Data Publishing
Scott Edmunds ISMB talk on Big Data Publishing
 
One Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific PublishersOne Scientist’s Wish List for Scientific Publishers
One Scientist’s Wish List for Scientific Publishers
 
Acting as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeActing as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decade
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
 
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
 
HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8
 
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
 
Reproducible Research and the Cloud
Reproducible Research and the CloudReproducible Research and the Cloud
Reproducible Research and the Cloud
 

More from Carole Goble

RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
Carole Goble
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Carole Goble
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
Carole Goble
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)
Carole Goble
 

More from Carole Goble (20)

The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
 
RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a Village
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
Open Research: Manchester leading and learning
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learning
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow Collaboratory
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 


Mtsr2015 goble-keynote

  • 1. Research Objects: why, what and how Professor Carole Goble CBE FREng FBCS The University of Manchester, UK The Software Sustainability Institute, UK carole.goble@manchester.ac.uk researchobject.org Metadata and Semantic Research Conference 2015, 9-11 Sept 2015, Manchester, UK
  • 2. Prologue e-Lab Collabs. & Shared Asset Repositories Knowledge, Metadata, Linked Data, Ontologies Software Engineering for Scientists Computational Workflow Systems Reproducibility Micro Publications Open Science Research Objects Linked Data for Science Scholarly Comms
  • 3. Prologue Biodiversity Systems Biology Synthetic Biology Astronomy Helio Physics Genomics Public Health Epidemiology Digital Preservation Social Science Pharmacology
  • 4. Knowledge Turning, Info Flow Barriers to Cure • Access to scientific resources • Coordination and Collaboration • Flow of Information http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation
  • 6.
  • 7. Virtual Witnessing* Scientific publications: • announce a result • convince readers the result is correct “papers in experimental [and computational science] should describe the results and provide a clear enough protocol [algorithm] to allow successful repetition and extension” Jill Mesirov, Broad Institute, 2010** **Accessible Reproducible Research, Science 22 January 2010, Vol. 327 no. 5964 pp. 415-416, DOI: 10.1126/science.1179653 *Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life (1985) Shapin and Schaffer.
  • 8. Bramhall et al., Quality of Methods Reporting in Animal Models of Colitis, Inflammatory Bowel Diseases, 2015: “Only one of the 58 papers reported all essential criteria on our checklist. Animal age, gender, housing conditions and mortality/morbidity were all poorly reported…” 50 papers randomly chosen from 378 manuscripts in 2011 that use Burrows-Wheeler Aligner for mapping Illumina reads: 31 no s/w version, parameters, exact version of genomic reference sequence; 26 no access to primary data sets. Nekrutenko & Taylor, Next-generation sequencing data interpretation: enhancing reproducibility and accessibility, Nature Reviews Genetics 13 (2012)
  • 9. “I can’t immediately reproduce the research in my own laboratory. It took an estimated 280 hours for an average user to approximately reproduce the paper.” Prof Phil Bourne Associate Director, NIH Big Data 2 Knowledge Program
  • 10. “An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment, [the complete data] and the complete set of instructions which generated the figures.” David Donoho, “Wavelab and Reproducible Research,” 1995
  • 11. From Manuscripts to “Research Objects” Multi-various, citable research products/assets
  • 12. From manuscripts to “Research Objects”
  • 13. From manuscripts to “Research Objects” Pre-packaged Docker images containing a bioinformatics tool and standardised interface through which data and parameters are passed. http://bioboxes.org
  • 14. FAIR Research, crossing silos From Manuscripts to “Research Objects” Datasets, Data collections Standard operating procedures Software, algorithms Configurations, Tools and apps, services Codes, code libraries Workflows, scripts System software Infrastructure Compilers, hardware Fragmentation
  • 15. FAIR RO Distributed Commons NIH BD2K, EU FAIRPorts…. Pooled Resources
  • 16. NIH BD2K Commons and Research Objects https://datascience.nih.gov/commons
  • 17. Why Research Objects? • Computational Workflows / Scripts – Multi-step, nested. – Data, executable codes (remote and local), libraries – Preservation, Repair – Reproducibility • Systems Biology – Models, data (construction, validation, predicted), SOPs, samples, articles – Structured Investigations, Studies, Assays – Exchange – Reproducibility
  • 18. Why Research Objects? • Computational Workflows / Scripts – Multi-step, nested. – Data, executable codes (remote and local), libraries – Preservation, Repair – Reproducibility • Systems Biology – Models, data (construction, validation, predicted), SOPs, samples, articles – Structured Investigations, Studies, Assays – Exchange – Reproducibility Commons Commons myexperiment.org fair-dom.org
  • 19.
  • 20. "Mapping present and future predicted distribution patterns for a meso-grazer guild in the Baltic Sea" by Sonja Leidenberger et al Workflow Commons
  • 21. Instruments, Materials, Method Data Scopes Input Data Software Output Data Config Parameters Methods: techniques, algorithms, spec. of the steps Materials: datasets, parameters, algorithm seeds Experiment Instruments: codes, services, scripts, underlying libraries Laboratory: sw and hw infrastructure, systems software, integrative platforms Setup Drummond, Replicability is not Reproducibility: Nor is it Good Science, online; Peng, Reproducible Research in Computational Science, Science, 2 Dec 2011: 1226-1227.
  • 22. Instruments, Materials, Method Read. Run. Remake Science changes, experiments & results vary, so do labs. Instruments break, labs decay. Zhao et al., Why workflows break - Understanding and combating decay in Taverna workflows, 8th Intl Conf e-Science 2012 http://atyourservice.blogs.xerox.com/files/2011/09/cloning-results-may-vary.jpg
  • 23. Reproducibility: working. reporting submit article and move on… publish article Research Environment Publication Environment Peer Review
  • 24. FAIR Reproducibility Find, Access, Interoperate, Reuse
  • 26. FAIRDOM Metadata framework link studies, link assets, map content to. Common elements and relationships between things produced and used in experiments. Common elements Specific elements for specific data types. Just Enough Results Model http://seek4science.org/JERMOntology http://isatab.sourceforge.net/format.html
  • 27. Penkler et al (2015) FEBSJ 282:1481-1511 https://dx.doi.org/10.1111/febs.13237
  • 29. Why Research Objects? Preserved, portable research products. Snapshots. inter-platform exchange, reproducibility Commons New Discovery
  • 30. Cross-Institutional e-Lab fragmentation parts scattered across subject specific/general resources 101 Innovations in Scholarly Communication - the Changing Research Workflow, Bosman and Kramer, 2015, http://figshare.com/articles/101_Innovations_in_Scholarly_Communication_the_Changing_Research_Workflow/1286826
  • 31. Why Research Objects? Active research products, snapshots • Fork. • Merge. • Version. • Cite. • Snapshot. • Live. [Martin Scharm] Haus et al, BMC Systems Biology, 2011, 5:10 Solvent production by Clostridium acetobutylicum
  • 32. F1000Research Living Figures versioned articles, in-article data manipulation R Lawrence Force2015, Vision Award Runner Up http://f1000.com/posters/browse/summary/1097482 Simply data + code Can change the definition of a figure, and ultimately the journal article Colomb J and Brembs B. Sub-strains of Drosophila Canton-S differ markedly in their locomotor behavior [v1; ref status: indexed, http://f1000r.es/3is] F1000Research 2014, 3:176 Other labs can replicate the study, or contribute their data to a meta-analysis or disease model - figure automatically updates. Data updates time-stamped. New conclusions added via versions.
  • 33. Publish, Release (like Software) An “evolving manuscript” would begin with a pre-publication, pre-peer review “beta 0.9” version of an article, followed by the approved published article itself, [ … ] “version 1.0”. Subsequently, scientists would update this paper with details of further work as the area of research develops. Versions 2.0 and 3.0 might allow for the “accretion of confirmation [and] reputation”. Ottoline Leyser […] assessment criteria in science revolve around the individual. “People have stopped thinking about the scientific enterprise”. http://www.timeshighereducation.co.uk/news/evolving-manuscripts-the-future-of-scientific-communication/2020200.article
  • 34. Jennifer Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012 Software-like Release paradigm • Agile development methods • Free Open Source Software methods https://tctechcrunch2011.files.wordpress.com/2011/05/tcdisrupt_tc-9.jpg
  • 36. Multi-various products, platforms, resources. First class citizens - id, manage, credit, track, profile, focus A Framework to Bundle, Port and Link (scattered) resources, related experiments. Metadata Objects that carry Research Context. Units of exchange. Bechhofer, Why linked data is not enough for scientists, DOI: 10.1016/j.future.2011.08.004
  • 37. Metadata Objects Evolving multi-typed, stewarded, sited, authored span research, researchers, platforms, time Contributions. Content. closed <-> open local <-> alien embed <-> refer Stewardship. Citation. Bigger on the inside than the outside, Content may be logically or physically inside TARDIS: Time and Relative Dimension in Space Scholarship https://meditationsfromzion.files.wordpress.com/2013/05/tardis.jpg
  • 38. What and How Framework Manifest Core model using standards Annotation profiles progressive extensions Implementation Profiles using legacy & commodity platforms Policies Tools Lifecycle Stewardship Training Principles & Conventions API specification Metadata formats
  • 39. Technology Independent. The least possible. The simplest feasible. Low tech. Graceful degradation. The Research Object Desiderata
  • 40. Manifests and Containers Container Packaging: Zip files, Docker images, BagIt, … Catalogues & Commons Platforms: FAIRDOM SEEK, Farr Commons CKAN, STELAR eLab, myExperiment Manifest Metadata Describes the aggregated resources, their annotations and their provenance Manifest
  • 41. Manifest Metadata Manifest Construction • Identification – id, title, creator, status…. • Aggregates – list of ids/links to resources • Annotations – list of annotations about resources Manifest Manifest Description • Checklists – what should be there • Provenance – where it came from • Versioning – its evolution • Dependencies – what else is needed Manifest
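The three manifest construction parts on slide 41 (identification, aggregates, annotations) can be sketched as a small data structure. This is an illustration only: the field names, the `build_manifest` helper and every identifier and URL below are hypothetical and are not taken from any Research Object specification.

```python
import json


def build_manifest(ro_id, title, creator, resources, annotations):
    """Assemble a manifest dict from the three construction parts on the slide.

    All key names here are illustrative assumptions, not a published schema.
    """
    return {
        # Identification: id, title, creator
        "id": ro_id,
        "title": title,
        "creator": creator,
        # Aggregates: list of ids/links to resources
        "aggregates": [{"uri": uri} for uri in resources],
        # Annotations: statements about aggregated resources
        "annotations": [
            {"about": about, "content": content}
            for about, content in annotations
        ],
    }


# Hypothetical example values
manifest = build_manifest(
    ro_id="https://example.org/ro/1",
    title="Example research object",
    creator="https://example.org/people/alice",
    resources=["data/input.csv", "scripts/analyse.py"],
    annotations=[("data/input.csv", "annotations/input-provenance.ttl")],
)

manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Keeping aggregates as links rather than embedded content matches the slide's point that a manifest lists ids/links to resources, so the content itself can live elsewhere.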
  • 42. Manifest Construction Unique identifiers as names for things. doi, epic, orcid, purl, RII, Identifiers.org Mechanism of aggregation to group things together. OAI-ORE Metadata about those things & how they relate to each other. W3C OADM http://w3id.org/ro/
  • 43. FAIR Manifest Descriptions: Types of RO Progressive Annotation Profiles Checklist Provenance Versioning Dependencies http://www.cnri.reston.va.us/papers/OverviewDigitalObjectArchitecture.pdf NISO-JATS Dublin Core EFO JERM SBML wfdesc
  • 44. Checklists aka Reporting Guidelines Consistent Reporting, Standardised Cataloguing, Validation Gamble, Goble, Klyne, Zhao, MIM: A Minimum Information Model vocabulary and framework for Scientific Linked Data, IEEE 8th Intl Conf on eScience, 2012 MeanWhealDiameter reports: must include values for the properties: SubjectId, SptSolution, Date, FollowUp; should include values for the properties: VariableLabel
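The must/should checklist on slide 44 can be read as a validation rule: missing "must" properties are errors, missing "should" properties are warnings. The sketch below is a plain-Python approximation under that assumption. It is not the MIM vocabulary or its tooling; the `check_report` helper and the sample values are hypothetical, and only the property names come from the slide.

```python
# Property names taken from the slide's MeanWhealDiameter example
MUST = ("SubjectId", "SptSolution", "Date", "FollowUp")
SHOULD = ("VariableLabel",)


def _missing(report, properties):
    """Return the properties that are absent or empty in the report."""
    return [p for p in properties if report.get(p) in (None, "")]


def check_report(report):
    """Check one report against the checklist.

    A report is valid when every 'must' property has a value;
    absent 'should' properties are only reported as warnings.
    """
    errors = _missing(report, MUST)
    warnings = _missing(report, SHOULD)
    return {"valid": not errors, "errors": errors, "warnings": warnings}


# Hypothetical report: FollowUp and VariableLabel are absent
report = {
    "SubjectId": "S-001",
    "SptSolution": "grass pollen",
    "Date": "2015-09-10",
}

result = check_report(report)
print(result)
```

Separating errors from warnings keeps the distinction the slide draws between "must include" and "should include", so a catalogue could accept a report with warnings while rejecting one with errors.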
  • 45. Implementation Profiles Research Object Bundle Specification Manifest https://w3id.org/bundle/ doi:10.5281/zenodo.10440 Container Packaging: Zip files, Docker images, BagIt, … Catalogues & Commons Platforms: FAIRDOM SEEK, Farr Commons CKAN, STELAR eLab, myExperiment
  • 46.
  • 47. RO Unzip • Reproducibility • Versioning • Systematic and extensible metadata collection • Cross platform exchange • Publishing Living Snapshot Sys and Syn Bio Experiments management and publishing
  • 49. Sys & Syn Biology Community Standards Bergmann, Rodriguez, Le Novère. COMBINE archive specification. <http://identifiers.org/combine.specifications/omex.version-1> (2014) Bergmann et al., COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project, BMC Bioinformatics 2014, 15:369 Combine with RO. Standardised metadata & API http://co.mbine.org/documents/archive https://github.com/stain/ro-combine-archive doi:10.5281/zenodo.10439 Martin Scharm Universität Rostock
  • 50. ATLAS Collider Data Analytics Portable, lightweight application runtime and packaging tool. Image ATLAS and CMS detector data Charles Vardeman, Da Huo University of Notre Dame All data and files of the execution + Instructions convert bundle manifest Relate files and layers Add provenance and annotations Link in other content Exchange Reproducibility Same data Same code Same run time environment Systematic and extensible metadata collection
  • 51. Computational Workflow Runs workflowrun.prov.ttl (RDF) outputA.txt outputC.jpg outputB/ intermediates/ 1.txt 2.txt 3.txt de/def2e58b-50e2-4949-9980-fd310166621a.txt inputA.txt workflow attribution execution environment Aggregating in Research Object ZIP folder structure (RO Bundle) mimetype application/vnd.wf4ever.robundle+zip .ro/manifest.json URI references Exchange Reproducibility Same data Same code Systematic and extensible metadata collection Workflow Annotation Profile Wf4Ever Project
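The ZIP layout on slide 51 (a `mimetype` entry, the run's files, and `.ro/manifest.json`) can be sketched with Python's standard `zipfile` module. This is a minimal sketch, not a conforming RO Bundle writer: storing `mimetype` first and uncompressed is assumed here from the usual ZIP-container convention, and the manifest contents, the `write_bundle` helper and the file bodies are hypothetical placeholders. Only the file names and the media type come from the slide.

```python
import io
import json
import zipfile

# Media type string as shown on the slide
MIMETYPE = "application/vnd.wf4ever.robundle+zip"


def write_bundle(target, files, manifest):
    """Write a ZIP with a mimetype entry, payload files and .ro/manifest.json.

    target   : path or writable binary file object
    files    : dict mapping archive path -> bytes or str content
    manifest : dict serialised to .ro/manifest.json (layout is illustrative)
    """
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        # Assumption: mimetype goes first and is stored without compression
        zf.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            zf.writestr(name, data)
        zf.writestr(".ro/manifest.json", json.dumps(manifest, indent=2))


# Placeholder contents for a few of the files named on the slide
files = {
    "inputA.txt": "example input\n",
    "outputA.txt": "example output\n",
    "workflowrun.prov.ttl": "# provenance trace would go here\n",
}
manifest = {"aggregates": [{"uri": "/" + name} for name in files]}

buffer = io.BytesIO()
write_bundle(buffer, files, manifest)

with zipfile.ZipFile(buffer) as zf:
    bundle_names = zf.namelist()
    bundle_mimetype = zf.read("mimetype").decode("ascii")
    mimetype_compression = zf.getinfo("mimetype").compress_type

print(bundle_names)
```

Writing to an in-memory buffer keeps the example free of side effects; passing a file path as `target` would produce a bundle on disk instead.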
  • 52. STELAR Asthma Research e-Lab STELAR e-Lab Requests for data Data Exports Comments, questions ALSPAC MAAS SEATON Ashford On-going data collection STELAR Researchers Isle of Wight Data Collection Methods and Results STELAR Team Farr Institute@Manchester
  • 53. Farr Institute Commons catalogues over safe havens Exchange Systematic and extensible meta-data collection
  • 54. NIH BD2K Commons and Research Objects Metadata Profiles RO Model API Community IDs RO Model Manifest Profile Implementation Profiles https://datascience.nih.gov/commons
  • 56. Many outstanding issues… Social & Cultural Technical Tragedy of the Commons https://doctorwhothing.files.wordpress.com/2014/01/doctor-who-fan-girl-group.jpg
  • 57. me ME my team close colleagues peers Personal productivity Retention & Reuse Publish driven Public Good Sharing & Reproducibility Access driven [Apologies to Resnick and Malone]
  • 58. FAIR Reward. Reducing Pain. Cost vs Benefit.
  • 59. RO Ramps. Born RO. Commodity Tooling, Libraries, Lightweight Making and Auto-making Manifest Descriptions Making Containers Literate Programming, electronic lab notebooks Rendering & Using Manifests
  • 60. FAIR Citation, credit, tracking • Citation – Resolution and semantics • Tamper-proof currency – Blockchain, Ethereum • RO trajectories – Data trajectories [Missier] – Provenance propagation • Credit trajectories – Micro-credit tracking • Social-political acceptance – All research products valued – FAIR publishing effort recognised • Defend it (snapshot) • Locate it (most recent) • Reuse it (a version, a component) • Credit it (contributory authorship) • Cross link it (connections)
  • 61. Knowledge Turning with ROs Simple approach, towards transparent FAIR principles https://d2t1xqejof9utc.cloudfront.net/screenshots/pics/1ddf584eb4cf6b1283baf9aa6d380cff/original.jpg
  • 62. Inspired by Bob Harrison • Incremental shift for infrastructure providers. • Moderate shift for policy makers and stewards. • Paradigm shift for researchers, their institutions and publishers. Knowledge Turning with ROs
  • 63. All the members of the Wf4Ever team Colleagues in Manchester’s Information Management Group http://www.researchobject.org http://www.wf4ever-project.org http://www.fair-dom.org http://seek4science.org http://rightfield.org.uk http://www.software.ac.uk http://www.datafairport.org Alan Williams Jo McEntyre Norman Morrison Stian Soiland-Reyes Paul Groth Tim Clark Juliana Freire Alejandra Gonzalez-Beltran Philippe Rocca-Serra Ian Cottam Susanna Sansone Kristian Garza Barend Mons Sean Bechhofer Philip Bourne Matthew Gamble Raul Palma Jun Zhao Neil Chue Hong Josh Sommer Matthias Obst Jacky Snoep David Gavaghan Rebecca Lawrence Stuart Owen Finn Bacall