Bioinformatics for lipidomics: putting 
some building blocks together 
Dr. Juan Antonio Vizcaíno 
EMBL-EBI 
Hinxton, Cambridge, UK
Juan A. Vizcaíno 
juan@ebi.ac.uk 
4th European Lipidomic meeting 
Graz, 24 September 2014 
Overview 
• A bit of general context… 
• Data standards: mzTab (and mzML) 
• Standard nomenclature 
• Public repository: MetaboLights 
• Specialist resource: LipidHome
Some of the main bioinformatics building blocks 
Data standards 
Databases, data 
repositories 
Stable identifiers for 
molecules 
Infrastructure to store and 
access the information 
Nothing new… Lipidomics (metabolomics) is following in the footsteps of other disciplines
Bioinformatics infrastructure 
Usually, we will not realize they are there… unless something does not work 
Overview 
• A bit of general context… 
• Data standards: mzTab (and mzML) 
• Standard nomenclature 
• Public repository: MetaboLights 
• Specialist resource: LipidHome
Data standards are needed 
Standards are needed in life: also in bioinformatics… 
With a small number 
of standards, 
data converters are feasible 
Metabolomics Standards Initiative 2007 publications 
Not much adoption happened in practice… 
Roy Goodacre Metabolomics (2014) 10:5-7 
Situation in the field 
Lab 1 Lab 2 Lab 3 … 
LipidXplorer LDA ALEX Others … 
Different output files from different tools 
How can these results coming from different groups be easily compared? 
(also applicable to visualization, storage, …)
Situation in the field 
Lab 1 Lab 2 Lab 3 … 
LipidXplorer LDA ALEX Others … 
Different output files from different tools 
Converters 
mzTab 
Common analysis/visualization tools
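The converter idea on this slide can be sketched as a small column-mapping step. This is only an illustration: the tool names `tool_a` and `tool_b`, their column headers, the common column names and the numeric values are all hypothetical, and real exports from LipidXplorer, LDA or ALEX would each need their own mapping.

```python
import csv
import io

# Hypothetical per-tool column names (assumption, not real export formats).
COLUMN_MAPS = {
    "tool_a": {"Lipid": "description", "m/z": "exp_mass_to_charge", "Area": "abundance"},
    "tool_b": {"name": "description", "mz": "exp_mass_to_charge", "intensity": "abundance"},
}
COMMON_COLUMNS = ["description", "exp_mass_to_charge", "abundance"]


def to_common_rows(tool, text):
    """Read tab-delimited tool output and rename its columns to the common ones."""
    mapping = COLUMN_MAPS[tool]
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for record in reader:
        # Keep only the columns we know how to map; drop everything else.
        rows.append({mapping[k]: v for k, v in record.items() if k in mapping})
    return rows


# Made-up example outputs from two labs using different tools.
tool_a_output = "Lipid\tm/z\tArea\nPC 36:2\t786.6007\t1200\n"
tool_b_output = "name\tmz\tintensity\nPC 36:2\t786.6010\t950\n"

merged = to_common_rows("tool_a", tool_a_output) + to_common_rows("tool_b", tool_b_output)
for row in merged:
    print("\t".join(row[c] for c in COMMON_COLUMNS))
```

Once every tool has such a mapping, downstream analysis or visualization code only needs to understand the one common layout.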
The mzTab format 
http://code.google.com/p/mztab/ 
mzTab – Aims and concept 
• To provide a simple and efficient way of exchanging results from MS 
approaches. 
• Simple summary report of the experimental results 
• Peptides and proteins identified in a given experimental setting 
• Small molecules identified 
• Reported quantification values 
• Technical and biological metadata 
• Easier to update and maintain, and flexible enough. 
• Easier to parse and use by the research community, systems 
biologists as well as providers of knowledge bases. 
• It can be used by non-experts in bioinformatics. 
Why a tab-delimited file? 
• An effective use of the XML based formats in the proteomics field 
(mzIdentML, mzQuantML) requires sophisticated bioinformatics 
expertise. 
• No alternative was available for metabolomics results… 
• Many researchers are still used to using MS Excel to “look” at or 
exchange their data. 
• The transcriptomics field has a widely used standard tab-delimited 
file format (MAGE-TAB) for exchanging data. The format MI TAB 
has also been a success in the molecular interaction field. 
mzTab – Format Specification (version 1.0.0) 
• Five sections: 
• (Optional) Metadata section 
• (Optional) Protein section 
• (Optional) Peptide section 
• (Optional) PSM (Peptide Spectrum Match) section 
• (Optional) Small Molecule section 
• Can report experimental design to a high detail level.
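As a rough illustration of the section layout, each mzTab line starts with a short prefix that identifies its section (for example `MTD` for metadata, `SMH`/`SML` for the small molecule header and rows, `COM` for comments). The sketch below groups lines by that prefix. The function name `split_sections` and the example content are assumptions made for illustration, and the example is not a complete or validated mzTab file.

```python
from collections import defaultdict

# Line prefixes used by mzTab 1.0 and the section each one belongs to.
SECTION_BY_PREFIX = {
    "MTD": "metadata",
    "PRH": "protein", "PRT": "protein",
    "PEH": "peptide", "PEP": "peptide",
    "PSH": "psm", "PSM": "psm",
    "SMH": "small_molecule", "SML": "small_molecule",
}


def split_sections(text):
    """Group the lines of an mzTab document by section, skipping comments."""
    sections = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate sections
        prefix = line.split("\t", 1)[0]
        if prefix == "COM":
            continue  # comment line
        section = SECTION_BY_PREFIX.get(prefix)
        if section is not None:
            sections[section].append(line)
    return dict(sections)


# Toy content, for illustration only.
example = "\n".join([
    "COM\tToy example, not a valid mzTab file",
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tdescription\tToy lipidomics example",
    "",
    "SMH\tidentifier\tdescription",
    "SML\tnull\tPC 36:2",
])

sections = split_sections(example)
print({name: len(lines) for name, lines in sections.items()})
```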
mzTab – Metadata Section 
• It provides additional information about the dataset. It consists 
of key-value pairs. 
• Extensive use of CVs/ontologies. 
• Different requirements depending on the file mode (‘summary’ 
or ‘complete’) and type (‘identification’ or ‘quantification’). 
• Support for experimental design (very similar to mzQuantML). 
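Because the metadata section is a list of key-value pairs, it can be read into a dictionary with very little code. The following is a minimal sketch, assuming tab-separated `MTD` lines of the form prefix, key, value. The helper name `parse_metadata` and the example values are illustrative, and the example omits most of the fields a real file would carry.

```python
def parse_metadata(text):
    """Collect 'MTD<TAB>key<TAB>value' lines into a dictionary."""
    metadata = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3 and fields[0] == "MTD":
            metadata[fields[1]] = fields[2]
    return metadata


# Toy metadata block; the values are made up for illustration.
example = "\n".join([
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tmzTab-mode\tSummary",
    "MTD\tmzTab-type\tQuantification",
    "MTD\tdescription\tToy lipidomics example",
])

metadata = parse_metadata(example)
print(metadata["mzTab-mode"], metadata["mzTab-type"])
```

A consumer could then branch on the mode and type values to decide which columns it should expect in the rest of the file.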
mzTab – Metadata Section 
mzTab – Small Molecule Table 
• Main contents: 
• Identifier 
• Unit-ID 
• Chemical formula 
• SMILES identifier 
• InChI identifier 
• Descriptive name 
• Mass to charge 
• Charge and retention time 
• Tax ID and species name 
• Spectral library name + version 
• Software name + version 
• Relative or absolute quantification values 
• Reference to the spectrum ID in an external file (e.g. mzML), 
… 
mzTab – Small Molecule Section 
• It contains mandatory and optional fields. 
• It is possible to link with the external mass spectra. 
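A minimal sketch of reading the small molecule section: the `SMH` line carries the column names and each `SML` line carries one row. The columns shown are only a subset chosen for illustration, the values are made up, and `parse_small_molecules` is a hypothetical helper rather than part of any mzTab library.

```python
def parse_small_molecules(text):
    """Turn SMH/SML lines into a list of dictionaries keyed by column name."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "SMH":
            header = fields[1:]
        elif fields[0] == "SML" and header is not None:
            rows.append(dict(zip(header, fields[1:])))
    return rows


# Toy section with a subset of columns; not a complete or validated table.
example = "\n".join([
    "SMH\tdescription\tchemical_formula\texp_mass_to_charge\tcharge\tspectra_ref",
    "SML\tPC 36:2\tC44H84NO8P\t786.6007\t1\tms_run[1]:scan=1234",
    "SML\tPC 34:1\tC42H82NO8P\t760.5851\t1\tms_run[1]:scan=1250",
])

rows = parse_small_molecules(example)
for row in rows:
    # The spectra reference points back to a spectrum in an external file.
    run, spectrum_id = row["spectra_ref"].split(":", 1)
    print(row["description"], row["exp_mass_to_charge"], run, spectrum_id)
```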
mzTab – Current implementations 
• jmzTab (Java API): Version 3.0 is now a stable version. Manuscript 
published in the journal Proteomics. 
• mzTab Validator, PRIDE XML to mzTab converter (PRIDE team). 
• mzIdentML and mzQuantML to mzTab converters (Andy Jones 
group). 
• MaxQuant: exporter in beta is available. 
• OpenMS (version 1.10). 
• R/Bioconductor package MSnbase (L. Gatto, Cambridge University). 
• LipidDataAnalyzer (J. Hartler, University of Graz, see next talk). 
• MetaboLights (EBI).
Implementation in Lipid Data Analyzer 
• In collaboration with TU of Graz. 
• mzTab export support is available from v1.6 (May 2012) 
mzTab format publications 
J. Griss et al., MCP, 2014 
http://code.google.com/p/mztab/ 
Q.W. Xu et al., Proteomics, 2014
COSMOS: EU FP7 project 
• COordination of Standards in MetabOlomicS 
• Started October 2012 
• 14 European partners 
• Worldwide collaborators 
• Standards!! 
• Data exchange 
• Open source 
http://www.cosmos-fp7.eu/
mzTab in Mx: extension ongoing 
• Meeting in Tuebingen to extend mzTab for metabolomics 
(March 2014). 
• NEW! 3 Tables for SM (analogous to Proteins) 
1) SmallMoleculeList 
2) SmallMoleculeFeatures 
3) SmallMoleculeEvidence 
Example file exists at 
https://github.com/sneumann/mtbls2/faahKO.mzTab 
mzML: Standard for MS data 
• A data format for the storage and exchange of MS output files 
• Originally designed for proteomics by merging the best aspects of 
both mzData and mzXML 
• Developed with full participation of academic researchers, hardware 
and software vendors 
• For both raw data and processed peaks. 
• Version 1.1 released in June 2009 
• Many implementations already exist in the proteomics world 
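To give a feel for what mzML looks like to a program, the sketch below reads the MS level of each spectrum from a tiny hand-written fragment using only the Python standard library. The fragment is an assumption made for illustration: it uses the mzML namespace and the controlled-vocabulary term `MS:1000511` ("ms level"), but it leaves out almost everything a schema-valid file requires, including the peak data. In practice an established mzML library would be the safer choice.

```python
import xml.etree.ElementTree as ET

# Namespace used by mzML documents.
NS = {"mz": "http://psi.hupo.org/ms/mzml"}

# Hand-written, heavily trimmed fragment; NOT a schema-valid mzML file.
MZML_FRAGMENT = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="run1">
    <spectrumList count="2">
      <spectrum index="0" id="scan=1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
      <spectrum index="1" id="scan=2" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2"/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""


def ms_levels(xml_text):
    """Map each spectrum id to its MS level, read from the cvParam annotation."""
    root = ET.fromstring(xml_text)
    levels = {}
    for spectrum in root.iterfind(".//mz:spectrum", NS):
        for param in spectrum.iterfind("mz:cvParam", NS):
            if param.get("accession") == "MS:1000511":
                levels[spectrum.get("id")] = int(param.get("value"))
    return levels


levels = ms_levels(MZML_FRAGMENT)
print(levels)
```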
mzML for Metabolomics 
• A no-brainer. No need to reinvent the wheel 
• No schema change required. 
• But in next documentation update: 
1. Describe multidimensional retention time 
(GCxGC/MS, LCxLC/MS and LC-IMS/MS) 
2. Describe tools for conversion 
(especially the GC world) 
Data standards in MS for metabolomics 
Steffen Neumann
Overview 
• A bit of general context… 
• Data standards: mzTab (and mzML) 
• Standard nomenclature 
• Public repository: MetaboLights and COSMOS 
• Specialist resource: LipidHome
Situation in the field 
• Very challenging to share experimental results efficiently: 
• No standard data format for experimental results (Excel 
spreadsheets are routinely used). 
• Lipid species are named slightly differently by 
different groups, and the level of detail also varies. 
• This situation may be good enough for human consumption, 
but not for computers. This hinders the development of: 
• Analysis tools 
• Data repositories 
• LIMS systems 
Standard LipidomicNet Nomenclature 
G. Liebisch et al., JLR, 2013 
• Address some limitations of LIPID MAPS (de facto standard 
nomenclature) for high-throughput lipid MS approaches 
• Enabling different levels of resolution for lipid species (needed to 
add clarification to the data) 
• Suitable for bioinformatics approaches (used in LipidHome) 
• Includes at present the main lipid classes (from FA to Sterols). 
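The different levels of resolution can be made concrete with a small parser for shorthand names. This is a simplified sketch under stated assumptions: a sum composition such as `PC 36:2` is treated as species level, chains joined by `_` (positions unknown) as Fatty Acid Scan species level, and chains joined by `/` (positions known) as Sub Species level, following the level names used for LipidHome later in this talk. The function `classify` and the regular expressions are illustrative, and ether lipids, hydroxylations and double bond positions are deliberately not handled.

```python
import re

CHAIN = r"\d+:\d+"  # carbons:double bonds, e.g. 18:2

# (level name, pattern) pairs; level names mirror those listed for LipidHome.
PATTERNS = [
    ("species", re.compile(rf"^(?P<cls>[A-Za-z]+) (?P<chains>{CHAIN})$")),
    ("fatty acid scan species",
     re.compile(rf"^(?P<cls>[A-Za-z]+) (?P<chains>{CHAIN}(?:_{CHAIN})+)$")),
    ("sub species",
     re.compile(rf"^(?P<cls>[A-Za-z]+) (?P<chains>{CHAIN}(?:/{CHAIN})+)$")),
]


def classify(name):
    """Return (level, species-level name) for a shorthand lipid name."""
    for level, pattern in PATTERNS:
        match = pattern.match(name)
        if match:
            chains = re.findall(CHAIN, match.group("chains"))
            carbons = sum(int(c.split(":")[0]) for c in chains)
            double_bonds = sum(int(c.split(":")[1]) for c in chains)
            return level, f"{match.group('cls')} {carbons}:{double_bonds}"
    raise ValueError(f"unsupported lipid name: {name!r}")


for example in ["PC 36:2", "PC 18:0_18:2", "PC 18:0/18:2"]:
    print(example, "->", classify(example))
```

Mapping every reported name up to its species-level name gives results reported at different levels of detail a common point of comparison.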
Nomenclature Structural Hierarchy 
Overview 
• A bit of general context… 
• Data standards: mzTab (and mzML) 
• Standard nomenclature 
• Public repository: MetaboLights 
• Specialist resource: LipidHome
Data sharing in Biology 
• In some ‘omics’ fields, data sharing ‘culture’ is well established. 
Generally, it is considered to be good scientific practice. 
• In metabolomics (lipidomics), that ‘culture’ is not there yet. 
• Public availability of data enables: 
• Reinterpretation. 
• Validation of the experimental results reported. 
• Reuse of the data (e.g. for meta-analysis studies). 
• Specific use cases for metabolomics (lipidomics): e.g. 
development of MRM assays, spectral libraries, 
fragmentation models, etc. 
MetaboLights – metabolomics repository 
www.ebi.ac.uk/metabolights 
(metabolights.org, metabolights.eu) 
MetaboLights – Data types stored 
• Primary research data 
• Investigation, Study, Assay and Protocols (metadata) 
• Instrument and analytical software output (raw / processed) 
• Metabolite references, QC, Blanks, … 
• Open source formats 
• Imported reference data, for each metabolite 
• Reference data imported from external databases 
• Chemistry, Biology, Reactions, Pathways, NMR/MS spectra, 
Literature 
• Link to: 
• ChEBI, Rhea and others
MetaboLights – Private Data – Share data 
MetabolomeXchange.org 
Overview 
• A bit of general context… 
• Data standards: mzTab (and mzML) 
• Standard nomenclature 
• Public repository: MetaboLights 
• Specialist resource: LipidHome
LipidHome 
J. Foster et al., PLOS One, 2013 
www.ebi.ac.uk/apweiler-srv/lipidhome
LipidHome: executive summary 
• Provides stable identifiers for all common lipid structures. 
• Provides all theoretical lipid structures, while maintaining clear 
separation between them and experimentally validated structures. 
• Evidence-based system for annotating lipids with papers. 
• A useful annotation level hierarchy that allows interrogation of the 
database from whatever results you have. E.g. Mass, structural 
fragment or empirical formula. 
• Programmatic access so that lipid identification software/ LIMS / 
analysis pipelines can be built on top of it. 
LipidHome Structural Hierarchy 
• Lipids are stored at the 
levels described in the 
proposed LipidomicNet 
nomenclature 
• Lipid identifications can 
accurately be mapped 
to suitable records in the 
database 
Use cases 
• What Species/Isomers are viable identifications for mass X 
with tolerance Y? 
• For species PC 36:2, what are the experimentally validated 
isomers/Fatty Acid Scan species? 
• What are all the experimentally validated Sub Species 
containing the fatty acid species 18:2? 
• What are all the identifications validated by 
“PMID:20564011”? 
• For the mass X, what is the most likely Sub Species based on 
previous identifications?
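The first use case (candidate identifications for mass X with tolerance Y) can be sketched with a toy in-memory table. This does not use the LipidHome web services or their API; the three-entry `LIPIDS` table, the helper names and the choice of a ppm tolerance are assumptions made for illustration. Masses are computed from the molecular formulae using monoisotopic element masses.

```python
import re

# Monoisotopic masses of the elements needed for this toy example.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
}

# Toy stand-in for a lipid database: species name -> molecular formula.
LIPIDS = {
    "PC 34:1": "C42H82NO8P",
    "PC 36:2": "C44H84NO8P",
    "PE 36:2": "C41H78NO8P",
}


def monoisotopic_mass(formula):
    """Sum element masses for a simple formula such as C44H84NO8P."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def search_by_mass(query_mass, tolerance_ppm):
    """Return names whose neutral mass lies within tolerance_ppm of the query."""
    hits = []
    for name, formula in LIPIDS.items():
        mass = monoisotopic_mass(formula)
        if abs(mass - query_mass) / mass * 1e6 <= tolerance_ppm:
            hits.append(name)
    return sorted(hits)


print(search_by_mass(785.5935, tolerance_ppm=10))
```

The query here is a neutral monoisotopic mass; a real search would first correct the measured m/z for the adduct and charge.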
The data in LipidHome 
Lipid classes shown: glycerolipids (GL: MG, DG, TG) and glycerophospholipids (GP: PC, LPC, PA, LPA, PE, LPE, PS, LPS, PI, LPI, PG, LPG), including ether-linked (O-, dO-, tO-) variants. 
Species: 17497 
Fatty Acid Scan species: 1821760 
Sub Species: 2140592 
Annotated Isomers: 7584 
Fatty Acid species: 164
Theoretical lipid generation 
• A set of rules was derived that describes common fatty 
acids. 
• Minimum carbons = 2 
• Maximum carbons = 30 
• Minimum double bonds = 0 
• Maximum double bonds = 10 
• Minimum gap between double bonds 
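Rule-based generation of this kind can be sketched as a simple enumeration. The carbon and double bond limits below come from the slide, but the slide does not give the minimum gap between double bonds, so `MIN_GAP = 3` and the feasibility check built on it are assumptions made purely for illustration. The result is therefore not expected to reproduce the counts reported for LipidHome.

```python
MIN_CARBONS, MAX_CARBONS = 2, 30   # from the slide
MIN_DB, MAX_DB = 0, 10             # from the slide
MIN_GAP = 3                        # ASSUMPTION: value not given on the slide


def is_feasible(carbons, double_bonds, min_gap=MIN_GAP):
    """Check whether the double bonds can fit on the chain.

    Assumed model: the first double bond can start at carbon 2 at the
    earliest, consecutive double bonds start at least min_gap carbons apart,
    and the last one must start no later than the second-to-last carbon.
    """
    if double_bonds == 0:
        return True
    last_start = 2 + (double_bonds - 1) * min_gap
    return last_start <= carbons - 1


def enumerate_fatty_acids():
    """List all carbons:double-bonds combinations allowed by the rules."""
    names = []
    for carbons in range(MIN_CARBONS, MAX_CARBONS + 1):
        for double_bonds in range(MIN_DB, MAX_DB + 1):
            if is_feasible(carbons, double_bonds):
                names.append(f"{carbons}:{double_bonds}")
    return names


fatty_acids = enumerate_fatty_acids()
print(len(fatty_acids), fatty_acids[:5])
```

Combining such fatty acids with each head group and linkage type is what makes the number of theoretical structures grow so quickly compared with the experimentally validated ones.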
LipidHome – Species view 
LipidHome – MS1 search output 
The big picture… 
• Different output files from different tools: LipidXplorer, LDA, ALEX, others 
• Data converters to mzTab 
• mzTab, with standard nomenclature 
• Common analysis and visualization software 
• mzTab importer into LIMS/resource; mzTab exporter from LIMS/resource 
• Local LIMS systems, MetaboLights
Acknowledgements 
Johannes Griss 
Qing-Wei Xu 
Joe Foster 
R. Salek & C. Steinbeck 
COSMOS partners 
G. Liebisch, M. Troetzmueller, F. Spener, H. Koefeler 
& M. Wakelam 
http://code.google.com/p/mztab/ 
Jurgen Hartler 
Gerhard Thallinger 
BBSRC PROCESS grant 
Mathias Walzer 
Timo Sachsenberg 
Oliver Kohlbacher
Questions?

Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringPrajakta Shinde
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxnoordubaliya2003
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)riyaescorts54
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayupadhyaymani499
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologycaarthichand2003
 
Solution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsSolution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsHajira Mahmood
 

Recently uploaded (20)

Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdf
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
preservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptxpreservation, maintanence and improvement of industrial organism.pptx
preservation, maintanence and improvement of industrial organism.pptx
 
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
(9818099198) Call Girls In Noida Sector 14 (NOIDA ESCORTS)
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyay
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Davis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technologyDavis plaque method.pptx recombinant DNA technology
Davis plaque method.pptx recombinant DNA technology
 
Solution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutionsSolution chemistry, Moral and Normal solutions
Solution chemistry, Moral and Normal solutions
 

Euro lipids 2014_graz

  • 1. Bioinformatics for lipidomics: putting some building blocks together Dr. Juan Antonio Vizcaíno EMBL-EBI Hinxton, Cambridge, UK
  • 2. Juan A. Vizcaíno juan@ebi.ac.uk 4th European Lipidomic meeting Graz, 24 September 2014 Overview • A bit of general context… • Data standards: mzTab (and mzML) • Standard nomenclature • Public repository: MetaboLights • Specialist resource: LipidHome
  • 3. Some of the main bioinformatics building blocks: data standards; databases and data repositories; stable identifiers for molecules; infrastructure to store and access the information. Nothing new… Lipidomics (metabolomics) is following in the steps of other disciplines.
  • 4. Bioinformatics infrastructure. Usually, we do not realize it is there… unless something does not work.
  • 5. Overview • A bit of general context… • Data standards: mzTab (and mzML) • Standard nomenclature • Public repository: MetaboLights • Specialist resource: LipidHome
  • 6. Data standards are needed. Standards are needed in life: also in bioinformatics… With a small number of standards, data converters are feasible.
  • 7. Metabolomics Standards Initiative: 2007 publications. Not much adoption happened in practice… (Roy Goodacre, Metabolomics (2014) 10:5-7)
  • 8. Situation in the field: Lab 1, Lab 2, Lab 3, … → LipidXplorer, LDA, ALEX, others → different output files from different tools. How can these results coming from different groups be easily compared? (Also applicable to visualization, storage, …)
  • 9. Situation in the field: Lab 1, Lab 2, Lab 3, … → LipidXplorer, LDA, ALEX, others → different output files from different tools → converters → mzTab → common analysis/visualization tools.
  • 10. The mzTab format: http://code.google.com/p/mztab/
  • 11. mzTab – Aims and concept • To provide a simple and efficient way of exchanging results from MS approaches. • Simple summary report of the experimental results • Peptides and proteins identified in a given experimental setting • Small molecules identified • Reported quantification values • Technical and biological metadata • Easier to update and maintain, and flexible enough. • Easier to parse and use by the research community, systems biologists as well as providers of knowledge bases. • It can be used by non-experts in bioinformatics.
  • 12. Why a tab-delimited file? • Effective use of the XML-based formats in the proteomics field (mzIdentML, mzQuantML) requires sophisticated bioinformatics expertise. • No alternative was available for metabolomics results… • Many researchers still use MS Excel to “look” at or exchange their data. • The transcriptomics field has a widely used standard tab-delimited file format (MAGE-TAB) for exchanging data. The MI TAB format has also been a success in the molecular interaction field.
  • 13. mzTab – Format Specification (version 1.0.0) • Five sections: • (Optional) Metadata section • (Optional) Protein section • (Optional) Peptide section • (Optional) PSM (Peptide Spectrum Match) section • (Optional) Small Molecule section • Can report the experimental design at a high level of detail.
  • 14. mzTab – Metadata Section • It provides additional information about the dataset. It consists of key-value pairs. • Extensive use of CVs/ontologies. • Different requirements depending on the file mode (‘summary’ or ‘complete’) and type (‘identification’ or ‘quantification’). • Support for experimental design (very similar to mzQuantML).
  • 15. mzTab – Metadata Section
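
As a rough illustration of the key-value layout described on the metadata slides, the sketch below reads the metadata lines of an mzTab file into a dictionary. It assumes the mzTab 1.0 convention that metadata lines are tab-separated and start with the prefix `MTD`. The function name `read_metadata` and the sample content are made up for this example and do not come from the talk.

```python
from typing import Dict, Iterable


def read_metadata(lines: Iterable[str]) -> Dict[str, str]:
    """Collect MTD key-value pairs from the lines of an mzTab file."""
    metadata: Dict[str, str] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumption: metadata rows look like MTD<TAB>key<TAB>value.
        if len(fields) >= 3 and fields[0] == "MTD":
            metadata[fields[1]] = fields[2]
    return metadata


# Hypothetical sample content, not taken from a real submission.
SAMPLE = [
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tmzTab-mode\tSummary",
    "MTD\tmzTab-type\tQuantification",
    "MTD\tdescription\tExample lipidomics experiment",
]

if __name__ == "__main__":
    for key, value in read_metadata(SAMPLE).items():
        print(f"{key}: {value}")
```

Because every line carries its own prefix, a reader like this can skip the table sections without knowing anything about them, which is part of what makes a tab-delimited format approachable for non-experts.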
  • 16. mzTab – Small Molecule Table • Main contents: • Identifier • Unit-ID • Chemical formula • SMILES identifier • InChI identifier • Descriptive name • Mass to charge • Charge and retention time • Tax ID and species name • Spectral library name + version • Software name + version • Relative or absolute quantification values • Reference to the spectrum ID in an external file (i.e. mzML), …
  • 17. mzTab – Small Molecule Section • It contains mandatory and optional fields. • It is possible to link with the external mass spectra.
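
The small molecule table can be read in much the same way. The sketch below assumes the mzTab 1.0 convention of one header row prefixed with `SMH` followed by data rows prefixed with `SML`. The column names in the sample (`identifier`, `chemical_formula`, `description`, `exp_mass_to_charge`, `charge`, `retention_time`) are my understanding of the 1.0 specification and should be checked against it; the row itself and the identifier `EXAMPLE:0001` are invented for illustration.

```python
from typing import Dict, Iterable, List


def read_small_molecules(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return one dict per SML row, keyed by the column names in the SMH row."""
    header: List[str] = []
    rows: List[Dict[str, str]] = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "SMH":
            # Header row: remember the column names that follow the prefix.
            header = fields[1:]
        elif fields[0] == "SML" and header:
            # Data row: pair each value with its column name.
            rows.append(dict(zip(header, fields[1:])))
    return rows


# Hypothetical table with a single invented row.
SAMPLE_TABLE = [
    "SMH\tidentifier\tchemical_formula\tdescription"
    "\texp_mass_to_charge\tcharge\tretention_time",
    "SML\tEXAMPLE:0001\tC44H84NO8P\tPC 36:2\t786.601\t1\t512.3",
]

if __name__ == "__main__":
    for row in read_small_molecules(SAMPLE_TABLE):
        print(row["description"], row["exp_mass_to_charge"])
```

Keying rows by header name rather than by position means optional columns can be present or absent without breaking the reader.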
  • 18. mzTab – Current implementations • jmzTab (Java API): Version 3.0 is now a stable version. Manuscript published in the journal Proteomics. • mzTab Validator, PRIDE XML to mzTab converter (PRIDE team). • mzIdentML and mzQuantML to mzTab converters (Andy Jones group). • MaxQuant: exporter in beta is available. • OpenMS (version 1.10). • R/Bioconductor package MSnbase (L. Gatto, Cambridge University). • LipidDataAnalyzer (J. Hartler, University of Graz, see next talk). • MetaboLights (EBI).
  • 19. Implementation in Lipid Data Analyzer • In collaboration with TU Graz. • mzTab export support is available from v1.6 (May 2012)
  • 20. mzTab format publications: J. Griss et al., MCP, 2014; Q.W. Xu et al., Proteomics, 2014. http://code.google.com/p/mztab/
  • 21. COSMOS: EU FP7 project • COordination of Standards in MetabOlomicS • Started October 2012 • 14 European partners • Worldwide collaborators • Standards!! • Data exchange • Open source http://www.cosmos-fp7.eu/
  • 22. mzTab in Mx: extension ongoing • Meeting in Tuebingen to extend mzTab for metabolomics (March 2014). • NEW! 3 tables for SM (analogous to proteins): 1) SmallMoleculeList 2) SmallMoleculeFeatures 3) SmallMoleculeEvidence. Example file exists at https://github.com/sneumann/mtbls2/faahKO.mzTab
  • 23. mzML: Standard for MS data • A data format for the storage and exchange of MS output files • Originally designed for proteomics by merging the best aspects of both mzData and mzXML • Developed with full participation of academic researchers, hardware and software vendors • For both raw data and processed peaks. • Version 1.1 released in June 2009 • Many implementations already exist in the proteomics world
  • 24. mzML for Metabolomics • A no-brainer. No need to reinvent the wheel. • No schema change required. • But in the next documentation update: 1) Describe multidimensional retention time (GCxGC/MS, LCxLC/MS and LC-IMS/MS) 2) Describe tools for conversion (especially the GC world)
  • 25. Data standards in MS for metabolomics (Steffen Neumann)
  • 26. Overview • A bit of general context… • Data standards: mzTab (and mzML) • Standard nomenclature • Public repository: MetaboLights and COSMOS • Specialist resource: LipidHome
  • 27. Situation in the field • Very challenging to share experimental results efficiently: • No standard data format for experimental results (Excel spreadsheets are routinely used). • Lipid species are named slightly differently by different groups, and the level of detail also varies. • This situation may be good enough for human consumption, but not for computers. This hinders the development of: • Analysis tools • Data repositories • LIMS systems
  • 28. Standard LipidomicNet Nomenclature (G. Liebisch et al., JLR, 2013) • Addresses some limitations of LIPID MAPS (de facto standard nomenclature) for high-throughput lipid MS approaches • Enables different levels of resolution for lipid species (needed to add clarification to the data) • Suitable for bioinformatics approaches (used in LipidHome) • Includes at present the main lipid classes (from FA to Sterols).
  • 29. Nomenclature Structural Hierarchy
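
To make the idea of "different levels of resolution" concrete, the sketch below collapses a name that lists individual fatty acyl chains into the corresponding sum-composition name such as `PC 36:2` (the form used later in the talk). The separators `_` and `/` between chains follow my understanding of the shorthand in Liebisch et al.; treat that, and the helper names `parse_chains` and `to_species_level`, as assumptions made for illustration. Ether prefixes such as `O-` are not handled.

```python
import re
from typing import List, Tuple

# One chain is written as <carbons>:<double bonds>, e.g. 18:2.
CHAIN = re.compile(r"^(\d+):(\d+)$")


def parse_chains(name: str) -> Tuple[str, List[Tuple[int, int]]]:
    """Split e.g. 'PC 18:0_18:2' into ('PC', [(18, 0), (18, 2)])."""
    lipid_class, _, chain_part = name.strip().partition(" ")
    chains: List[Tuple[int, int]] = []
    # Assumption: chains are separated by '_' or '/'.
    for token in re.split(r"[_/]", chain_part):
        match = CHAIN.match(token)
        if match is None:
            raise ValueError(f"unrecognised chain description: {token!r}")
        chains.append((int(match.group(1)), int(match.group(2))))
    return lipid_class, chains


def to_species_level(name: str) -> str:
    """Sum carbons and double bonds over all chains, e.g. 'PC 36:2'."""
    lipid_class, chains = parse_chains(name)
    carbons = sum(c for c, _ in chains)
    double_bonds = sum(d for _, d in chains)
    return f"{lipid_class} {carbons}:{double_bonds}"


if __name__ == "__main__":
    for example in ("PC 18:0_18:2", "PC 18:1/18:1", "PC 36:2"):
        print(example, "->", to_species_level(example))
```

Mapping detailed names up to a coarser level is what lets results reported at different levels of detail be compared at the level they share.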
  • 30. Overview • A bit of general context… • Data standards: mzTab (and mzML) • Standard nomenclature • Public repository: MetaboLights • Specialist resource: LipidHome
  • 31. Data sharing in Biology • In some ‘omics’ fields, a data sharing ‘culture’ is well established. Generally, it is considered to be good scientific practice. • In metabolomics (lipidomics), that ‘culture’ is not there yet. • Public availability of data enables: • Reinterpretation. • Validation of the experimental results reported. • Reuse of the data (e.g. for meta-analysis studies). • Specific use cases for metabolomics (lipidomics): e.g. development of MRM assays, spectral libraries, fragmentation models, etc.
  • 32. MetaboLights – metabolomics repository: www.ebi.ac.uk/metabolights (metabolights.org, metabolights.eu)
  • 33. MetaboLights – Data types stored • Primary research data • Investigation, Study, Assay and Protocols (metadata) • Instrument and analytical software output (raw / processed) • Metabolite references, QC, Blanks, … • Open source formats • Imported Reference data, for each metabolite • Reference data imported from external databases • Chemistry, Biology, Reactions, Pathways, NMR/MS spectra, Literature • Link to: • ChEBI, Rhea and others
  • 34. MetaboLights – Private Data – Share data
  • 35. MetabolomeXchange.org
  • 36. Overview • A bit of general context… • Data standards: mzTab (and mzML) • Standard nomenclature • Public repository: MetaboLights • Specialist resource: LipidHome
  • 37. LipidHome (J. Foster et al., PLOS One, 2013): www.ebi.ac.uk/apweiler-srv/lipidhome
  • 38. LipidHome: executive summary • Provides stable identifiers for all common lipid structures. • Provides all theoretical lipid structures, while maintaining clear separation between them and experimentally validated structures. • Evidence-based system for annotating lipids with papers. • A useful annotation level hierarchy that allows interrogation of the database from whatever results you have, e.g. mass, structural fragment or empirical formula. • Programmatic access so that lipid identification software / LIMS / analysis pipelines can be built on top of it.
  • 39. LipidHome Structural Hierarchy • Lipids are stored at the levels described in the proposed LipidomicNet nomenclature • Lipid identifications can accurately be mapped to suitable records in the database
  • 40. Use cases • What species/isomers are viable identifications for mass X with tolerance Y? • For species PC 36:2, what are the experimentally validated isomers / fatty acid scan species? • What are all the experimentally validated sub species containing the fatty acid species 18:2? • What are all the identifications validated by “PMID:20564011”? • For the mass X, what is the most likely sub species based on previous identifications?
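
The first use case ("mass X with tolerance Y") can be sketched as a simple window search. The code below is not LipidHome's implementation or API; it runs against a small in-memory list, and the names `Record`, `RECORDS` and `search_by_mass` are hypothetical. The tolerance is expressed in parts per million, which is one common choice and an assumption here. The masses are monoisotopic values computed from the molecular formulae and included only as illustrative data.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    name: str
    mass: float  # monoisotopic neutral mass in Da


# Illustrative records only; a real resource would hold far more.
RECORDS = [
    Record("PC 34:1", 759.5778),
    Record("PC 36:2", 785.5935),
    Record("PE 39:2", 785.5935),  # same elemental composition as PC 36:2
    Record("PC 36:1", 787.6091),
]


def search_by_mass(records: List[Record], query_mass: float,
                   tolerance_ppm: float) -> List[Record]:
    """Return records within the ppm window, closest mass first."""
    window = query_mass * tolerance_ppm / 1e6
    hits = [r for r in records if abs(r.mass - query_mass) <= window]
    return sorted(hits, key=lambda r: abs(r.mass - query_mass))


if __name__ == "__main__":
    for hit in search_by_mass(RECORDS, 785.5935, tolerance_ppm=10.0):
        print(hit.name, hit.mass)
```

Note that a mass alone cannot separate PC 36:2 from PE 39:2 in this toy list, which is why the remaining use cases rely on validated isomers and prior identifications.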
  • 41. The data in LipidHome. Categories, classes and sub classes covered: GL: MG (MG, MG O-); DG (DG, DG O-, DG dO-); TG (TG, TG O-, TG dO-, TG tO-). GP: PC (PC, PC O-, PC dO-, LPC, LPC O-); PA (PA, PA O-, PA dO-, LPA, LPA O-); PE (PE, PE O-, PE dO-, LPE, LPE O-); PS (PS, PS O-, PS dO-, LPS, LPS O-); PI (PI, PI O-, PI dO-, LPI, LPI O-); PG (PG, PG O-, PG dO-, LPG, LPG O-). Record counts: Species: 17497; Fatty Acid Scan species: 1821760; Sub Species: 2140592; Annotated Isomers: 7584; Fatty Acid species: 164.
  • 42. Theoretical lipid generation • A set of rules was derived that describes common fatty acids. • Minimum carbons = 2 • Maximum carbons = 30 • Minimum double bonds = 0 • Maximum double bonds = 10 • Minimum gap between double bonds
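
The rules above lend themselves to a small enumeration. The sketch below lists (carbons, double bonds) combinations within the stated limits. The slide does not give a value for the minimum gap between double bonds, so `min_gap` is a required parameter here and the value used in the usage line is an arbitrary placeholder. The feasibility check is also a simplification of my own, not the rule set used by LipidHome, so the number of combinations it yields should not be expected to match the counts reported in the talk.

```python
from typing import Iterator, Tuple

# Limits taken from the slide.
MIN_CARBONS, MAX_CARBONS = 2, 30
MIN_DOUBLE_BONDS, MAX_DOUBLE_BONDS = 0, 10


def fatty_acids(min_gap: int) -> Iterator[Tuple[int, int]]:
    """Yield (carbons, double_bonds) pairs allowed by the rules.

    min_gap is the assumed minimum distance, counted in bonds, between
    the positions of two consecutive double bonds.
    """
    for carbons in range(MIN_CARBONS, MAX_CARBONS + 1):
        bonds = carbons - 1  # carbon-carbon bonds available in the chain
        for double_bonds in range(MIN_DOUBLE_BONDS, MAX_DOUBLE_BONDS + 1):
            # Simplified check: d double bonds spaced min_gap apart
            # span 1 + (d - 1) * min_gap bonds of the chain.
            if double_bonds == 0:
                needed = 0
            else:
                needed = 1 + (double_bonds - 1) * min_gap
            if needed <= bonds:
                yield carbons, double_bonds


if __name__ == "__main__":
    # min_gap=3 is a placeholder value chosen only for this demonstration.
    combos = list(fatty_acids(min_gap=3))
    print(len(combos), "combinations, e.g.", combos[:5])
```

Combining such fatty acids across the positions of each lipid class is what makes the theoretical space so much larger than the set of experimentally validated structures.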
  • 43. LipidHome – Species view
  • 44. LipidHome – MS1 search output
  • 45. The big picture… (diagram) Components: different output files from different tools (LipidXplorer, LDA, ALEX, others); data converters to mzTab; mzTab; common analysis and visualization software; standard nomenclature; mzTab importer into LIMS/resource; mzTab exporter from LIMS/resource; local LIMS systems; MetaboLights.
  • 46. Acknowledgements: Johannes Griss, Qing-Wei Xu, Joe Foster, R. Salek & C. Steinbeck, COSMOS partners, G. Liebisch, M. Troetzmueller, F. Spener, H. Koefeler & M. Wakelam, Jurgen Hartler, Gerhard Thallinger, Mathias Walzer, Timo Sachsenberg, Oliver Kohlbacher. BBSRC PROCESS grant. http://code.google.com/p/mztab/
  • 47. Questions?