From peer-reviewed to peer-reproduced: a role for research objects in scholarly publishing in the life sciences
Alejandra González-Beltrán
Oxford e-Research Centre, University of Oxford
-ontology.org
Bioinformatics Open Source Conference (BOSC), Dublin, Ireland
July 10-11 2015
"AGBell Notebook" by Alexander Graham Bell. (d. 1922) -
page 40-41 of Alexander Graham Bell Family Papers in the Library of Congress' Manuscript Division.
Licensed under Public Domain via Wikimedia Commons
- http://commons.wikimedia.org/wiki/File:AGBell_Notebook.jpg#/media/File:AGBell_Notebook.jpg
http://petcaretips.net/bonding-rabbit-to-pets.html
Many things have been said about
the challenges of
science reproducibility
and how it can go wrong…
Difficulties when the description
of the experimental steps
is only available in
lab notebooks and scientific articles;
lack of data,
lack of software tools
required for analysis
Can data models and computational workflows help in
capturing the experimental processes and reproduce findings?
How?
experimental
description
(design & steps)
conclusions
computational
workflows
aggregation & workflow preservation
• open peer-review
• availability of
• data
• analysis scripts
• documentation
Evaluation of the SOAPdenovo2 tool for the de novo assembly of genomes from short DNA reads produced by next-generation sequencing; SOAPdenovo2 implements improvements over the SOAPdenovo1 assembler.
pre-publication history
https://github.com/aquaskyline/SOAPdenovo2
http://sourceforge.net/projects/soapdenovo2/
Experimental Description
EXCELERATE interoperability component
http://www.ncbi.nlm.nih.gov/books/NBK279831/
http://elixir-uk.org/interoperability-infrastructure
The experimental plan - computational case
Predictor Variables (Factor Name, Factor Type)
genome assembly algorithm: SOAPdenovo2, SOAPdenovo1, ALL-PATHS-LG
genome size: bacterial genome, insect genome, human genome
3x3 factorial design
9 study groups
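The 3x3 factorial design can be spelled out by crossing the two factors. The sketch below is illustrative only: the factor names and levels come from the slide, while the variable names and the dictionary layout are assumptions made for this example.

```python
from itertools import product

# Factor levels as listed on the slide.
assembly_algorithms = ["SOAPdenovo2", "SOAPdenovo1", "ALL-PATHS-LG"]
genome_sizes = ["bacterial genome", "insect genome", "human genome"]

# A full factorial design crosses every level of one factor
# with every level of the other: 3 x 3 = 9 study groups.
study_groups = [
    {"genome assembly algorithm": algorithm, "genome size": size}
    for algorithm, size in product(assembly_algorithms, genome_sizes)
]

for number, group in enumerate(study_groups, start=1):
    print(number, group["genome assembly algorithm"], "x", group["genome size"])
```

Each dictionary corresponds to one study group, so the response variables can later be recorded per group.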
The experimental plan - computational case
bacterial genome: S. aureus, R. sphaeroides
insect genome: B. impatiens
human genome: Chinese Han genome (or YH genome)
Response Variables (with units)
genome coverage (%)
computation run time (h)
peak memory consumption (GB)
contig N50 (kb or bp)
scaffold N50 (kb or bp)
number of errors
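Contig N50 and scaffold N50 are the least familiar of these response variables, so a small worked example may help. N50 is the length L such that sequences of length L or longer account for at least half of the total assembled length. The function name and the example lengths below are made up for illustration; they are not data from the SOAPdenovo2 study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the total length is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical lengths, for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_lengths))  # total is 300; 80 + 70 reaches 150, so N50 is 70
```

The same function applies to contigs and scaffolds; only the input list differs.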
The experimental steps
Unambiguous identification of resources (e.g. record from public repositories); persistent identifiers
if available (ORCIDs, DOIs); we suggest a dedicated article section
Experimental workflows - identification of processes, their inputs and outputs
Experimental design: identify experimental goal, independent and response variables
Reproducing SOAPdenovo2 results
with Galaxy workflows
S. aureus pipeline
Publishing findings as nanopublications
assertion
provenance
publication info
nanopublication (NP): an NP represents structured data along with its provenance in a single publishable and citable entity
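The three parts of a nanopublication (assertion, provenance, publication info) can be pictured as three small sets of triples. The sketch below is a simplified illustration and not the serialization used by the authors or by any nanopublication tool: the class, the `ex:` identifiers and the example statements are all hypothetical, and only `prov:wasDerivedFrom` and `dcterms:creator` are borrowed from existing vocabularies.

```python
from dataclasses import dataclass, field


@dataclass
class Nanopublication:
    """Three groups of (subject, predicate, object) triples, kept as plain strings."""
    assertion: list = field(default_factory=list)
    provenance: list = field(default_factory=list)
    publication_info: list = field(default_factory=list)

    def is_complete(self):
        # In this sketch, a nanopublication needs at least one triple in each part.
        return all([self.assertion, self.provenance, self.publication_info])


# Hypothetical content; the ex: identifiers are placeholders, not real URIs.
example = Nanopublication(
    assertion=[("ex:human_assembly_SOAPdenovo2", "ex:genomeCoverageVersusSOAPdenovo1", "increased")],
    provenance=[("ex:assertion", "prov:wasDerivedFrom", "ex:galaxy_workflow_run")],
    publication_info=[("ex:nanopublication", "dcterms:creator", "ex:author")],
)

print(example.is_complete())
```

Keeping the assertion separate from its provenance is what lets a single finding be cited and traced back to the workflow run that produced it.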
Abstract & Conclusions
assertion provenance
Generation of nanopublications for all the results of the
response variables
NanoMaton
templates for nanopublications
Prevent priming; report all findings corresponding to the identified
response variables
Remain neutral and report all findings of similar
importance with the same weight
“genome coverage increased
over the human data when
comparing SOAPdenovo2
against SOAPdenovo1”
Link conclusions to experimental description
Aggregation and workflow preservation as Research Objects
http://www.researchobject.org/
Research Object: enables the aggregation of the digital resources contributing to findings of computational research, including results, data and software, as citable compound digital objects
http://isa-tools.github.io/soapdenovo2
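As a rough illustration of aggregation, the sketch below collects a few resources into one manifest-like structure. The layout (an `aggregates` list of `uri` and `description` entries) is an assumption made for this example and should not be read as the Research Object model's actual serialization; the two relative paths are hypothetical, while the GitHub URL is the software repository cited earlier in the slides.

```python
import json


def build_manifest(resources):
    """Build a minimal manifest-like dict that aggregates (uri, description) pairs."""
    return {
        "id": "/",
        "aggregates": [
            {"uri": uri, "description": description}
            for uri, description in resources
        ],
    }


# Illustrative resources; the relative paths are placeholders.
resources = [
    ("https://github.com/aquaskyline/SOAPdenovo2", "software"),
    ("data/experimental-description/", "data (hypothetical path)"),
    ("results/nanopublications/", "results (hypothetical path)"),
]

manifest = build_manifest(resources)
print(json.dumps(manifest, indent=2))
```

The point of the structure is that results, data and software are listed together, so the whole set can be cited and preserved as one compound object.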
From narrative to self-described structured data
Model & workflow assisted experimental description and review process
Depth and breadth of semantic resources, clear meaning of experimental
elements
Ruibang Luo, University of Hong Kong
Tin-Lap Lee, Chinese University of Hong Kong
Tak-wah Lam, University of Hong Kong
SOAPdenovo2
Scott Edmunds, GigaScience
Peter Li, GigaScience
Marco Roos, Leiden University
Mark Thompson, Leiden University
Rajaram Kaliyaperumal, Leiden University
Eelke van der Horst, Leiden University
Jun Zhao, Lancaster University
María Susana Avila García, Oxford University
Philippe Rocca-Serra, Oxford University
Susanna-Assunta Sansone, Oxford University
Alejandra Gonzalez-Beltran, Oxford University
Team
Questions?
You can email us...
isatools@googlegroups.com
View our blog
http://isatools.wordpress.com
Follow us on Twitter
@isatools
View our websites
View our Git repo & contribute
http://github.com/ISA-tools
Thanks for your attention!

DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
 
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiologyBLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
 
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 

From peer-reviewed to peer-reproduced: a role for research objects in scholarly publishing in the life sciences

  • 1. From peer-reviewed to peer-reproduced: a role for research objects in scholarly publishing in the life sciences. Alejandra González-Beltrán, Oxford e-Research Centre, University of Oxford. -ontology.org. Bioinformatics Open Source Conference (BOSC), Dublin, Ireland, July 10-11, 2015.
  • 3. Many things have been said about the challenges of science reproducibility and how it can go wrong… Difficulties arise when the description of the experimental steps is only available in lab notebooks and scientific articles, and when the data or the software tools required for analysis are lacking. Image credits: "AGBell Notebook" by Alexander Graham Bell (d. 1922), pages 40-41 of the Alexander Graham Bell Family Papers in the Library of Congress' Manuscript Division, licensed under Public Domain via Wikimedia Commons, http://commons.wikimedia.org/wiki/File:AGBell_Notebook.jpg#/media/File:AGBell_Notebook.jpg; http://petcaretips.net/bonding-rabbit-to-pets.html
  • 4. Can data models and computational workflows help in capturing the experimental processes and reproducing findings? How? Through: experimental description (design & steps); conclusions; computational workflows; aggregation & workflow preservation.
  • 8. Evaluation of the SOAPdenovo2 tool for the de novo assembly of genomes from small DNA segment reads produced by next-generation sequencing, implementing improvements over the SOAPdenovo1 assembler. Open peer review with pre-publication history; availability of data, analysis scripts and documentation. Code: https://github.com/aquaskyline/SOAPdenovo2 and http://sourceforge.net/projects/soapdenovo2/
  • 10. Experimental Description. EXCELERATE interoperability component: http://www.ncbi.nlm.nih.gov/books/NBK279831/ and http://elixir-uk.org/interoperability-infrastructure
  • 11. The experimental plan - computational case. Predictor variables (Factor Name, Factor Type): genome assembly algorithm; genome size.
  • 12. Factor levels. Genome assembly algorithm: SOAPdenovo2, SOAPdenovo1, ALLPATHS-LG. Genome size: bacterial genome, insect genome, human genome.
  • 13. 3x3 factorial design: each of the three algorithms is applied to each of the three genome sizes, giving 9 study groups.
  • 14. Genomes used: S. aureus and R. sphaeroides (bacterial), B. impatiens (insect), Chinese Han genome, or YH genome (human).
  • 15. Response variables (with units): genome coverage (%); computation run time (h); peak memory consumption (GB); contig N50 (kb or bp); scaffold N50 (kb or bp); number of errors.
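Contig N50 and scaffold N50 are two of the response variables listed in the experimental plan. As a reminder of what the metric means, the following is a minimal sketch in Python using the common definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name `n50` and the example lengths are illustrative assumptions and do not come from the talk or the SOAPdenovo2 evaluation.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are sorted from longest to shortest and accumulated until
    at least half of the total assembled length is covered; the length
    of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (in kb), for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))
```

The same function applies to scaffolds; only the input lengths change.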
  • 16. The experimental steps. Experimental design: identify the experimental goal and the independent and response variables. Experimental workflows: identify the processes, their inputs and outputs. Unambiguous identification of resources (e.g. records from public repositories), with persistent identifiers if available (ORCIDs, DOIs); we suggest a dedicated article section.
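The factorial design described in the experimental plan can be written down as structured data rather than narrative. The sketch below, in Python, is only an illustration of that idea: the `Factor` class, the `study_groups` helper and the `"count"` unit label are assumptions made for this example, not part of the ISA tools or of the talk. The factor levels and response variables are the ones listed on the slides.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Factor:
    """A predictor (independent) variable and its levels."""
    name: str
    levels: tuple


def study_groups(factors):
    """Enumerate every combination of factor levels (full factorial design)."""
    names = [factor.name for factor in factors]
    level_sets = [factor.levels for factor in factors]
    return [dict(zip(names, combo)) for combo in product(*level_sets)]


factors = [
    Factor("genome assembly algorithm",
           ("SOAPdenovo2", "SOAPdenovo1", "ALLPATHS-LG")),
    Factor("genome size",
           ("bacterial genome", "insect genome", "human genome")),
]

# Response variables with their units, as listed on the slides.
# "count" is a label assumed here for the unitless error tally.
response_variables = {
    "genome coverage": "%",
    "computation run time": "h",
    "peak memory consumption": "GB",
    "contig N50": "kb or bp",
    "scaffold N50": "kb or bp",
    "number of errors": "count",
}

groups = study_groups(factors)
print(len(groups))  # 3 algorithms x 3 genome sizes
for group in groups:
    print(group)
```

Declaring the factors and response variables up front makes it straightforward to check later that a result has been reported for every study group and every response variable.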
  • 18. Reproducing SOAPdenovo2 results with Galaxy workflows: S. aureus pipeline
  • 20. Reproducing SOAPdenovo2 results with Galaxy workflows. Values shown on the slide (labels not recoverable from the transcript): 2241, 400, 30, 119.0, 11, 106, 24, 68, 0.
  • 21. Publishing findings as nanopublications. A nanopublication (NP) consists of an assertion, its provenance and publication info; it represents structured data along with its provenance in a single publishable and citable entity.
  • 22. Publishing findings as nanopublications. From the abstract & conclusions: assertion and provenance. Generation of nanopublications for all the results of the response variables, using NanoMaton templates for nanopublications. Prevent priming: report all findings corresponding to the identified response variables. Remain neutral: report all findings of similar importance with the same weight.
  • 23. Link conclusions to experimental description, e.g. “genome coverage increased over the human data when comparing SOAPdenovo2 against SOAPdenovo1”
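To make the three-part structure of a nanopublication concrete, the sketch below models the quoted conclusion as an assertion graph with separate provenance and publication-info graphs, tied together by a head graph. It uses plain Python tuples instead of an RDF library. The `np:` schema terms (`hasAssertion`, `hasProvenance`, `hasPublicationInfo`) come from the nanopublication schema; every `example.org` URI, the predicate names under it and the `Nanopublication` class are hypothetical placeholders, not the identifiers or templates used by the authors.

```python
from dataclasses import dataclass, field

NP = "http://www.nanopub.org/nschema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PROV = "http://www.w3.org/ns/prov#"
EX = "http://example.org/soapdenovo2/"  # hypothetical namespace


@dataclass
class Nanopublication:
    """A nanopublication as four named graphs of (s, p, o) triples."""
    uri: str
    assertion: list = field(default_factory=list)
    provenance: list = field(default_factory=list)
    pubinfo: list = field(default_factory=list)

    def graphs(self):
        # The head graph links the nanopublication to its three parts.
        head = [
            (self.uri, RDF_TYPE, NP + "Nanopublication"),
            (self.uri, NP + "hasAssertion", self.uri + "#assertion"),
            (self.uri, NP + "hasProvenance", self.uri + "#provenance"),
            (self.uri, NP + "hasPublicationInfo", self.uri + "#pubinfo"),
        ]
        return {
            "head": head,
            "assertion": self.assertion,
            "provenance": self.provenance,
            "pubinfo": self.pubinfo,
        }


np1 = Nanopublication(
    uri=EX + "np1",
    # Assertion: the finding itself (hypothetical predicates).
    assertion=[
        (EX + "genome_coverage", EX + "increasedIn", EX + "human_genome"),
        (EX + "genome_coverage", EX + "whenComparing", EX + "SOAPdenovo2"),
        (EX + "genome_coverage", EX + "against", EX + "SOAPdenovo1"),
    ],
    # Provenance: where the assertion came from (placeholder article URI).
    provenance=[
        (EX + "np1#assertion", PROV + "wasDerivedFrom", EX + "article"),
    ],
    # Publication info: who published the nanopublication (placeholder).
    pubinfo=[
        (EX + "np1", PROV + "wasAttributedTo", EX + "some_author"),
    ],
)

for name, triples in np1.graphs().items():
    print(name, len(triples))
```

Keeping the assertion separate from its provenance is what lets each finding be cited on its own while still pointing back to the experimental description it came from.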
  • 24. Aggregation and workflow preservation as a Research Object (http://www.researchobject.org/): enables the aggregation of the digital resources contributing to findings of computational research, including results, data and software, as citable compound digital objects.
  • 25. Aggregation and workflow preservation as a Research Object (http://www.researchobject.org/); see http://isa-tools.github.io/soapdenovo2
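The aggregation idea can be illustrated with a small manifest that lists the resources contributing to a finding. The sketch below is written in Python and produces JSON; the key names (`id`, `aggregates`, `uri`, `role`) are only loosely modelled on Research Object manifests and should be read as assumptions, as should the `build_manifest` helper and the role labels. The two URLs are the ones given on the slides.

```python
import json


def build_manifest(object_id, resources):
    """Build a minimal manifest aggregating (uri, role) pairs.

    The structure is an illustrative assumption, not the official
    Research Object serialisation.
    """
    return {
        "id": object_id,
        "aggregates": [{"uri": uri, "role": role} for uri, role in resources],
    }


resources = [
    ("https://github.com/aquaskyline/SOAPdenovo2", "software"),
    ("http://isa-tools.github.io/soapdenovo2", "experimental description and results"),
]

manifest = build_manifest("/", resources)
print(json.dumps(manifest, indent=2))
```

Data sets, workflows and nanopublications would be added to `resources` in the same way, so that the whole set can be cited as one compound object.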
  • 26. From narrative to self-described structured data. Model & workflow assisted experimental description and review process. Depth and breadth of semantic resources; clear meaning of experimental elements.
  • 27. Team. SOAPdenovo2: Ruibang Luo, University of Hong Kong; Tin-Lap Lee, Chinese University of Hong Kong; Tak-wah Lam, University of Hong Kong. Scott Edmunds, GigaScience; Peter Li, GigaScience; Marco Roos, Leiden University; Mark Thompson, Leiden University; Rajaram Kaliyaperumal, Leiden University; Eelke van der Horst, Leiden University; Jun Zhao, Lancaster University; María Susana Avila García, Oxford University; Philippe Rocca-Serra, Oxford University; Susanna-Assunta Sansone, Oxford University; Alejandra González-Beltrán, Oxford University.
  • 28. Questions? You can email us: isatools@googlegroups.com. View our blog: http://isatools.wordpress.com. Follow us on Twitter: @isatools. View our websites. View our Git repo & contribute: http://github.com/ISA-tools. Thanks for your attention!