ISA-Tab as a COSMOS
standard
Metabolomics Data Standards and Capture Workshop
Metabolomics Society Meeting 2014, Tsuruoka, Japan
Philippe Rocca-Serra (PhD)
University of Oxford e-Research Centre
Data exchange, Let information flow!
• Tenets of Science: reproducibility of results and findings
• justifies the right to access data
• publishing a manuscript is no longer enough
• data should be published and released alongside the manuscript
• A GEO or an ArrayExpress for Metabolomic Data
• What would you do if you had access to 25000 studies in
Metabolomics today?
Data Provenance and Preservation
It is all about structuring experimental information to make it available to
computer and software agents to enable:
Notes in Lab Books
(information for humans)
Spreadsheets and Tables
(the compromise)
Facts as RDF statements
(information for machines)
Exchange as Main Goal
• Exchange of experimental description: the Study Plan
• description of subjects and perturbations: ISA-TAB
• Exchange of spectral acquisition file: the Raw Data
• enables review, assessment, appraisal, reuse: mzML, nmrML
• Exchange of findings: the Results and Interpretation
• identified metabolites: mzTab and Metabolite Annotation File
The essential value of
Contextual Data or Metadata
• “Data about the Data”
–description of the data (descriptive metadata)
• Lazy way: “it is all in the file name” approach
CNL_MOA1_C2_LD_TP1_EWR.cdf
• Is this enough to understand what this experiment is about... 5 years from now?
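A minimal sketch of why the file-name approach falls short: splitting the example name above yields tokens but no labels. The function name and the `token_N` keys are illustrative only; the slides do not say what any token means, so the sketch does not guess.

```python
from pathlib import Path


def filename_tokens(name: str) -> list[str]:
    """Split a data file name into its underscore-separated tokens."""
    return Path(name).stem.split("_")


tokens = filename_tokens("CNL_MOA1_C2_LD_TP1_EWR.cdf")

# The tokens carry no labels: nothing in the file name states which token is
# a treatment, a time point or a replicate. That knowledge lives elsewhere.
unlabelled = {f"token_{i}": token for i, token in enumerate(tokens, start=1)}
print(unlabelled)
```

Descriptive metadata replaces the positional `token_N` keys with declared field names, so the meaning no longer depends on the memory of whoever named the file.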
ISA-Tab format in a nutshell
(1)
ISA metadata specifications:
•workflow and process orientated
•compatible with checklist enforcement
•compatible with external vocabulary resources
•compatible by design with existing schemas
• Investigation File: cardinality: 1..1
–purpose: think “executive summary”
– layout: rows of key value pairs organized in blocks
– content:
• Why? general study description
• How? methods / protocol declaration
• How? variable declarations (predictor and response variables)
• Who? contact and affiliation information
• Study File: cardinality: 1..n
–layout: true header/row of record table (think “sorting, filtering of samples”)
–content:
• What? Listing all biological materials collected over the study course and their
treatments.
• Assay File: cardinality: 1..n
–layout: true header/row of record table (think “sorting, filtering of datafiles”)
–content:
• What? Listing all data acquisition events and data files collected by a given assay
and subsequent data transformations
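The three-file layout can be sketched as follows. This is a heavily abridged illustration, not a valid ISA-Tab archive: the block and field names follow the ISA-Tab investigation file, but the identifiers, titles and file names are invented for the example, and the parser is a hypothetical helper rather than part of any ISA tool.

```python
import csv
import io

# Investigation file: rows of key/value pairs organised in blocks.
# One investigation points to 1..n study files and 1..n assay files.
INVESTIGATION = (
    "STUDY\n"
    "Study Identifier\tS1\n"
    "Study Title\tExample metabolomics study\n"
    "Study File Name\ts_study.txt\n"
    "STUDY ASSAYS\n"
    "Study Assay File Name\ta_assay_lcms.txt\ta_assay_nmr.txt\n"
)


def parse_investigation(text):
    """Return {block name: {field name: [values]}} for a key/value layout."""
    blocks = {}
    current = None
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row:
            continue
        if len(row) == 1 and row[0].isupper():
            # A single upper-case cell opens a new block.
            current = blocks.setdefault(row[0], {})
        elif current is not None:
            current[row[0]] = row[1:]
    return blocks


blocks = parse_investigation(INVESTIGATION)
study_files = blocks["STUDY"]["Study File Name"]
assay_files = blocks["STUDY ASSAYS"]["Study Assay File Name"]
print(study_files, assay_files)
```

The study and assay files referenced here are plain header/row tables, which is what makes sorting and filtering of samples and data files straightforward.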
ISA-Tab format in a nutshell
(II)
ISA syntax: Characteristics[<tag>]
Declaring and annotating an ISA Source Name or Sample Name
ISA syntax: Protocol REF with sets of Parameter Value[<tag>], resulting in an ISA node Sample Name
Worked example - ISA Study Sample File:
Describing Study Subjects and their features
ISA syntax: Factor Value[<tag>] for reporting treatments or study groups as a set of levels of independent variables
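A minimal sketch of a study sample table using the column syntax above. The header names follow the ISA-Tab pattern (`Characteristics[...]`, `Protocol REF`, `Parameter Value[...]`, `Factor Value[...]`); the tags, subjects and values are assumptions made up for illustration and do not come from the slides.

```python
import csv
import io
from collections import defaultdict

# Hypothetical study sample table: one row per Source Name -> Sample Name path.
STUDY_TABLE = (
    "Source Name\tCharacteristics[organism]\tProtocol REF\t"
    "Parameter Value[sampling site]\tSample Name\tFactor Value[treatment]\n"
    "subject1\tHomo sapiens\tsample collection\tplasma\tsample1\tcontrol\n"
    "subject2\tHomo sapiens\tsample collection\tplasma\tsample2\ttreated\n"
    "subject3\tHomo sapiens\tsample collection\tplasma\tsample3\ttreated\n"
)

rows = list(csv.DictReader(io.StringIO(STUDY_TABLE), delimiter="\t"))

# Factor Value columns hold the levels of the independent variables,
# so study groups fall out of a simple group-by on that column.
groups = defaultdict(list)
for row in rows:
    groups[row["Factor Value[treatment]"]].append(row["Sample Name"])

print(dict(groups))
```

Because the study groups are stated in a column of their own, they can be recovered by software without parsing sample names.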
Worked example - ISA Assay File:
reporting signal acquisition events
ISA Pattern for LC-MS: Splitting in 2 distinct assay tables, one per scan polarity
ISA Pattern for GC-MS: Report derivatization as an extra sample prep step
ISA Pattern for NMR:
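The LC-MS pattern above (one assay table per scan polarity) can be sketched like this. The column name `Parameter Value[scan polarity]`, the sample names and the raw file names are assumptions for illustration; only the idea of splitting into two tables comes from the slide.

```python
import csv
import io
from collections import defaultdict

POLARITY_COLUMN = "Parameter Value[scan polarity]"  # assumed column name

# Hypothetical acquisition records mixing both polarities.
RECORDS = [
    {"Sample Name": "sample1", POLARITY_COLUMN: "positive",
     "Raw Spectral Data File": "sample1_pos.raw"},
    {"Sample Name": "sample1", POLARITY_COLUMN: "negative",
     "Raw Spectral Data File": "sample1_neg.raw"},
    {"Sample Name": "sample2", POLARITY_COLUMN: "positive",
     "Raw Spectral Data File": "sample2_pos.raw"},
]


def split_by_polarity(records, column=POLARITY_COLUMN):
    """Group acquisition records into one list per scan polarity."""
    tables = defaultdict(list)
    for record in records:
        tables[record[column]].append(record)
    return dict(tables)


def to_tsv(rows):
    """Serialise rows as a tab-delimited table with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]),
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tables = split_by_polarity(RECORDS)
for polarity, rows in tables.items():
    print(f"--- assay table for {polarity} mode ---")
    print(to_tsv(rows))
```

The GC-MS pattern would differ in the sample preparation part of the table, where derivatization appears as an additional `Protocol REF` step before acquisition.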
• Different kinds of experiments, Different annotation
needs
• CIMR ISA configurations to deal with Biological
Specifics
• Clinical Context (humans as subjects)
• Non-clinical Context (animals as subjects)
• Plant Context (plants as subjects)
• In-vitro Context (cells as subjects)
Dealing with Diversity:
ISA configurations for ISAcreator
In-vitro study Plant study Clinical study
https://github.com/ISA-tools/Configuration-Files
Dealing with Diversity:
Refining CIMR ISA configurations
• Different kinds of experiments, Different annotation needs
• additional ISA assay table definitions to deal with technology
needs
• Targeted profiling or global metabolomics analysis
• liquid chromatography mass spectrometry
• gas chromatography mass spectrometry
• direct infusion mass spectrometry
• 1D/2D NMR spectroscopy
• Metabolic Flux Analysis (ongoing work with Prof. Marta Cascante)
Developed to be a user-friendly way to enter standards-compliant metadata: it has lots of features...
But these are just some of them... we also have a data entry wizard and an import utility...
The ISAcreator: an editor for ISA-Tab
format
https://github.com/ISA-tools/ISAcreator
ISAcreator features: visualizing experimental workflows
Work completed while investigating a new approach to glyph design guided by a taxonomy.
See Maguire et al., Taxonomy-Based Glyph Design – with a Case Study on Visualizing Workflows of Biological Experiments, IEEE Transactions on Visualization and Computer Graphics, 2012
This bit of code shows that you must first load an ISA configuration, which defines the expected table layout, before you can proceed
ISAcreator features: API
https://github.com/ISA-tools/ISAcreator/wiki/API
https://github.com/ISA-tools/Risa
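The principle that a configuration defines the expected table layout can be sketched in a few lines. This is not the ISAcreator or Risa API: the `CONFIGURATION` dictionary and `missing_columns` function are hypothetical stand-ins for the real configuration files and the checks the tools perform.

```python
# Hypothetical stand-in for an ISA configuration: for each table type,
# the columns a compliant table is expected to declare.
CONFIGURATION = {
    "study sample": [
        "Source Name",
        "Characteristics[organism]",
        "Protocol REF",
        "Sample Name",
    ],
}


def missing_columns(header, table_type, configuration=CONFIGURATION):
    """List the expected columns that are absent from a table header."""
    expected = configuration[table_type]
    return [column for column in expected if column not in header]


header = ["Source Name", "Protocol REF", "Sample Name"]
problems = missing_columns(header, "study sample")
print(problems)
```

Loading the configuration first is what lets a tool decide whether a table is complete, which is why no table can be processed without one.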
ISAViewer: ISA-Tab viewing component on the
web
https://github.com/ISA-tools/ISATab-Viewer
ISA patterns for reporting QC
samples
Annotation Rule of Thumb: does the reported value satisfy the ‘is_a’ rule?
In this representation, QC1 would be interpreted as an instance of organism whose type is ‘vanillic acid’ => incorrect
Improved representation: QC1 would be interpreted as an instance of chemical compound whose type is ‘vanillic acid’, acting as ‘positive control’ => correct
Furthermore, only the 2 actual study subjects will be accounted for
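The rule of thumb can be sketched as a small check. The `IS_A` hierarchy below is a toy stand-in for a real ontology, and the subject names and the organism value are assumptions for illustration; only QC1 and 'vanillic acid' come from the slide.

```python
# Toy is_a hierarchy: child term -> parent term (a stand-in for an ontology).
IS_A = {
    "vanillic acid": "chemical compound",
    "Homo sapiens": "organism",
}


def satisfies_is_a(value, category):
    """Return True if `value` is the category or one of its descendants."""
    node = value
    while node is not None:
        if node == category:
            return True
        node = IS_A.get(node)
    return False


# (material name, declared category, reported value)
ANNOTATIONS = [
    ("subject1", "organism", "Homo sapiens"),
    ("subject2", "organism", "Homo sapiens"),
    ("QC1", "chemical compound", "vanillic acid"),
]

all_valid = all(satisfies_is_a(value, category)
                for _, category, value in ANNOTATIONS)
study_subjects = [name for name, category, _ in ANNOTATIONS
                  if category == "organism"]

print(satisfies_is_a("vanillic acid", "organism"))  # the flawed annotation
print(all_valid, study_subjects)
```

Declaring QC1 as a chemical compound passes the check and keeps it out of the subject count, so only the two actual study subjects are reported.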
Why does it matter?
It is all about structuring experimental information to make it available to
computer and software agents to enable:
Notes in Lab Books
(information for humans)
Spreadsheets and Tables
(the compromise)
Facts as RDF statements
(information for machines)
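The step from a table row to machine-readable facts can be sketched with plain tuples. This is an illustration of the idea only: the `ex:` prefix, the row content and the helper are assumptions, the predicates are not valid RDF identifiers, and the real conversion is done by the isa2owl project, whose mapping is not reproduced here.

```python
def row_to_triples(row, subject_column="Sample Name", prefix="ex:"):
    """Turn one table row into (subject, predicate, object) statements."""
    subject = f"{prefix}{row[subject_column]}"
    return [
        (subject, f"{prefix}{column}", value)
        for column, value in row.items()
        if column != subject_column
    ]


# Hypothetical rows from a study sample table.
ROWS = [
    {"Sample Name": "sample1", "Characteristics[organism]": "Homo sapiens",
     "Factor Value[treatment]": "control"},
    {"Sample Name": "sample2", "Characteristics[organism]": "Homo sapiens",
     "Factor Value[treatment]": "treated"},
]

triples = [triple for row in ROWS for triple in row_to_triples(row)]

# Once every cell is a statement, a query is a filter over statements.
treated = [s for s, p, o in triples
           if p == "ex:Factor Value[treatment]" and o == "treated"]
print(treated)
```

Each cell becomes an explicit statement about a named thing, which is what allows software agents to query across studies without knowing each table's layout in advance.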
RDF representation of Metabolomics
Experimental information
• Query Expansion and Data Discovery
https://github.com/ISA-tools/isa2owl
RDF representation of Metabolomics
Experimental information
• Conversion of 80% of public datasets
• Tests against case-queries report partial success
• Points to the need to enforce stricter curation rules in
order to fully benefit from the RDF representation
• Existing conversion already enables easy cohort
creation
• Ongoing work: converting Metabolite Annotation Files (MAF) to RDF
• enabling querying from experimental metadata to
chemical identities and vice-versa.
https://github.com/ISA-tools/isa2owl
Contributing to
MetaboLights and ISA
• BBSRC UK-China Award & BGI funded
Hackathon
• venue: BGI Hong Kong
• Participants:
• MetaboLights / BGI / ISA / Birmingham / Hong Kong University
• Outcome:
• ISA-Tab web viewer code
• Functional Specifications & Code for
DoE Wizard API
Contributing to
MetaboLights and ISA
• BBSRC UK-China Award funded Hackathon
will be back!
• 2nd Meeting to be organised
• Fancy participating? Get in touch!
• isatools@googlegroups.com
Don’t miss out
• 2 Main Publishers involved in developing
Data Journals
• Scott Edmunds (GigaScience) and
Susanna Sansone (NPG Scientific Data)
• Representatives from Metabolomics
Repository
• Get all the help you need for depositing
your data and increase the visibility of your
research!
Questions??
You can email us...
isatools@googlegroups.com
View our blog
http://isatools.wordpress.com
Follow us on Twitter
@isatools
View our website
http://www.isa-tools.org
Thanks for listening...
View our Git repo &
contribute
http://github.com/ISA-tools

Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptxTwin's paradox experiment is a meassurement of the extra dimensions.pptx
Twin's paradox experiment is a meassurement of the extra dimensions.pptx
 
‏‏VIRUS - 123455555555555555555555555555555555555555
‏‏VIRUS -  123455555555555555555555555555555555555555‏‏VIRUS -  123455555555555555555555555555555555555555
‏‏VIRUS - 123455555555555555555555555555555555555555
 

ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

  • 1. ISA-Tab as a COSMOS standard. Metabolomics Data Standards and Capture Workshop, Metabolomics Society Meeting 2014, Tsuruoka, Japan. Philippe Rocca-Serra (PhD), University of Oxford e-Research Centre
  • 2. Data exchange: let information flow! • Tenets of science: reproducibility of results and findings • justifies the right to access data • publishing a manuscript is no longer enough • data should be published and released alongside • A GEO or an ArrayExpress for metabolomics data • What would you do if you had access to 25,000 studies in metabolomics today?
  • 3. Data Provenance and Preservation. It is all about structuring experimental information to make it available to computer and software agents: Notes in Lab Books (information for humans); Spreadsheets and Tables (the compromise); Facts as RDF statements (information for machines)
  • 4. Exchange as Main Goal • Exchange of experimental description: the Study Plan • description of subjects and perturbations: ISA-Tab • Exchange of spectral acquisition files: the Raw Data • enables review, assessment, appraisal, reuse: mzML, nmrML • Exchange of findings: the Results and Interpretation • identified metabolites: mzTab and Metabolite Annotation File
  • 5. The essential value of Contextual Data or Metadata • “Data about the Data” – description of the data (descriptive metadata) • Lazy way: the “it is all in the file name” approach, e.g. CNL_MOA1_C2_LD_TP1_EWR.cdf • Is this enough to understand what this experiment is about ... 5 years from now?
  • 6. ISA-Tab format in a nutshell (I). ISA metadata specifications: • workflow and process orientated • compatible with checklist enforcement • compatible with external vocabulary resources • compatible by design with existing schemas
  • 7. ISA-Tab format in a nutshell (II) • Investigation File: cardinality: 1..1 – purpose: think “executive summary” – layout: rows of key-value pairs organized in blocks – content: • Why? general study description • How? methods / protocol declaration • How? variable declarations (predictor and response variables) • Who? contact and affiliation information • Study File: cardinality: 1..n – layout: true header/row-of-records table (think “sorting, filtering of samples”) – content: • What? Listing all biological materials collected over the study course and their treatments. • Assay File: cardinality: 1..n – layout: true header/row-of-records table (think “sorting, filtering of data files”) – content: • What? Listing all data acquisition events and data files collected by a given assay and subsequent data transformations
  • 8. Worked example - ISA Study Sample File: describing study subjects and their features. ISA syntax: Characteristics[<tag>] for declaring and annotating an ISA Source Name or Sample Name. ISA syntax: Protocol REF with sets of Parameter Value[<tag>], resulting in an ISA node Sample Name. ISA syntax: Factor Value[<tag>] for reporting treatments or study groups as a set of levels of independent variables.
  • 9. Worked example - ISA Assay File: reporting signal acquisition events. ISA pattern for LC-MS: splitting into 2 distinct assay tables, one per scan polarity. ISA pattern for GC-MS: report derivatization as an extra sample prep step. ISA pattern for NMR:
  • 10.
  • 11. Dealing with Diversity: ISA configurations for ISAcreator • Different kinds of experiments, different annotation needs • CIMR ISA configurations to deal with biological specifics • Clinical context (humans as subjects) • Non-clinical context (animals as subjects) • Plant context (plants as subjects) • In-vitro context (cells as subjects)
  • 12. Dealing with Diversity: Refining CIMR ISA configurations. In-vitro study, plant study, clinical study. https://github.com/ISA-tools/Configuration-Files
  • 13. Dealing with Diversity: Refining CIMR ISA configurations • Different kinds of experiments, different annotation needs • additional ISA assay table definitions to deal with technology needs • Targeted profiling or global metabolomics analysis • liquid chromatography mass spectrometry • gas chromatography mass spectrometry • direct infusion mass spectrometry • 1D/2D NMR spectroscopy • Metabolic Flux Analysis (ongoing work with Prof. Marta Cascante)
  • 14. The ISAcreator: an editor for ISA-Tab format. Developed to be a user-friendly way to enter standards-compliant metadata: it has lots of features... But these are just some of them... we also have a data entry wizard and an import utility... https://github.com/ISA-tools/ISAcreator
  • 15. The ISAcreator: an editor for ISA-Tab format https://github.com/ISA-tools/ISAcreator
  • 16. ISAcreator features: visualizing experimental workflows. Work completed during the investigation of a new approach to creating glyphs, using a taxonomy for guidance. See Maguire et al., Taxonomy-Based Glyph Design – with a Case Study on Visualizing Workflows of Biological Experiments, IEEE Transactions on Visualization and Computer Graphics, 2012
  • 17. ISAcreator features: API. This bit of code indicates that you need to load an ISA configuration, which defines the expected table layout, in order to proceed. https://github.com/ISA-tools/ISAcreator/wiki/API
  • 19. ISAViewer: ISA-Tab viewing component on the web https://github.com/ISA-tools/ISATab-Viewer
  • 20. ISA patterns for reporting QC samples. Annotation rule of thumb: does the reported value satisfy the ‘is_a’ rule? In this representation, QC1 would be interpreted as an instance of organism whose type is ‘vanillic acid’ => incorrect. Improved representation: QC1 would be interpreted as an instance of chemical compound whose type is ‘vanillic acid’, acting as ‘positive control’ => correct. Furthermore, only the 2 actual study subjects will be accounted for.
  • 21. Why does it matter? It is all about structuring experimental information to make it available to computer and software agents: Notes in Lab Books (information for humans); Spreadsheets and Tables (the compromise); Facts as RDF statements (information for machines)
  • 22. RDF representation of Metabolomics Experimental information • Query Expansion and Data Discovery https://github.com/ISA-tools/isa2owl
  • 23. RDF representation of Metabolomics Experimental information • Conversion of 80% of public datasets • Tests against case queries report partial success • Points to the need to enforce stricter curation rules in order to fully benefit from the RDF representation • Existing conversion already enables easy cohort creation • Ongoing work: converting MAF files to RDF • enabling querying from experimental metadata to chemical identities and vice versa. https://github.com/ISA-tools/isa2owl
  • 24. Contributing to MetaboLights and ISA • BBSRC UK-China Award & BGI funded Hackathon • venue: BGI Hong Kong • Participants: • MetaboLights/BGI/ISA/Birmingham/Hong Kong University • Outcome: • ISA-Tab web viewer code • Functional Specifications & Code for DoE Wizard API
  • 25. Contributing to MetaboLights and ISA • BBSRC UK-China Award funded Hackathon will be back! • 2nd meeting to be organised • Fancy participating? Get in touch! • isatools@googlegroups.com
  • 26. Don’t miss out • 2 main publishers involved in developing Data Journals: Scott Edmunds (GigaScience) and Susanna Sansone (NPG Scientific Data) • Representatives from Metabolomics Repository • Get all the help you need for depositing your data and increase the visibility of your research!
  • 28. Questions?? You can email us... isatools@googlegroups.com View our blog http://isatools.wordpress.com Follow us on Twitter @isatools View our website http://www.isa-tools.org Thanks for listening... View our Git repo & contribute http://github.com/ISA-tools
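
The column syntax named on slides 7 and 8 (Characteristics[<tag>], Parameter Value[<tag>], Factor Value[<tag>] in a tab-delimited header/row table) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ISA tools' own parser: the table content, the tag names (organism, sampling time, dose) and the helper functions classify_header and factor_levels are all hypothetical.

```python
import csv
import io
import re

# Hypothetical two-row study sample table; every value here is invented for illustration.
STUDY_TSV = (
    "Source Name\tCharacteristics[organism]\tProtocol REF\t"
    "Parameter Value[sampling time]\tSample Name\tFactor Value[dose]\n"
    "subject1\tHomo sapiens\tsample collection\t2 h\tsample1\tlow\n"
    "subject2\tHomo sapiens\tsample collection\t2 h\tsample2\thigh\n"
)

# Matches the "Kind[tag]" header pattern named on slide 8.
TAGGED = re.compile(r"^(Characteristics|Parameter Value|Factor Value)\s*\[(.+)\]$")


def classify_header(header):
    """Return (kind, tag) for a qualified column, or (header, None) otherwise."""
    match = TAGGED.match(header.strip())
    if match:
        return match.group(1), match.group(2).strip()
    return header.strip(), None


def factor_levels(tsv_text):
    """Collect the set of levels reported under each Factor Value[<tag>] column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    levels = {}
    for row in reader:
        for header, value in row.items():
            kind, tag = classify_header(header)
            if kind == "Factor Value":
                levels.setdefault(tag, set()).add(value)
    return levels


if __name__ == "__main__":
    print(classify_header("Characteristics[organism]"))
    print(factor_levels(STUDY_TSV))
```

Because the study file is a plain header/row table, grouping samples by factor level is a matter of reading columns, which is the "sorting, filtering of samples" use the slides describe.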

Editor's Notes

  1. Applying a protocol is reported by adding a “Protocol REF” field, which can be qualified by associated “Parameter Value” fields, as well as a “Performer” field to track the operator effect and a “Date” field to track the day effect.
  2. Ecosystem revolving around the ISA-Tab format. Support for massively parallel datasets. Focus on a couple of the tools, e.g. OntoMaton. Gradient from left to right: from configuration (annotation guidelines) and curation tools to analysis and usage. People can choose the path that is most convenient for their use case.
  3. Once a configuration has been defined, the ISAcreator editor can read it, and the spreadsheet will be aware of the terminology restrictions set by the super user in charge of defining annotation requirements. In this screenshot, you can see the allowed values for reporting a flow cytometry instrument using OBI classes in a Flow Cytometry Assay, as defined in ISAconfigurator. Note the metadata pulled from OBI and readily available for people to check that the term they select is correct.
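
The idea of a configuration that restricts the terms allowed in a column can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the ISAcreator or ISAconfigurator API or the real configuration format: the ALLOWED_TERMS mapping, the column name, the instrument names and the find_violations helper are all hypothetical.

```python
# Hypothetical configuration: column header -> set of permitted terms.
# This dict only mimics the idea of a term restriction; it is not the actual
# ISA configuration format, and the instrument names are placeholders.
ALLOWED_TERMS = {
    "Parameter Value[instrument]": {"instrument A", "instrument B"},
}


def find_violations(rows, allowed_terms):
    """Return (row_index, column, value) for every value outside its allowed set."""
    violations = []
    for index, row in enumerate(rows):
        for column, permitted in allowed_terms.items():
            value = row.get(column)
            # Columns absent from a row are skipped rather than reported.
            if value is not None and value not in permitted:
                violations.append((index, column, value))
    return violations


if __name__ == "__main__":
    rows = [
        {"Parameter Value[instrument]": "instrument A"},
        {"Parameter Value[instrument]": "instrument Z"},
    ]
    print(find_violations(rows, ALLOWED_TERMS))
```

In an editor, a check of this kind would run as values are entered, so that a curator sees immediately when a term falls outside what the configuration permits.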