The force of computational mass spectrometry awakens in the EBI
Juan Antonio Vizcaíno
Reza Salek
Ken Haug
HRF Talk
17 December 2015
• Very short intro to Mass Spectrometry
• PRIDE and MetaboLights
• ProteomeXchange and MetabolomeXchange
• Data standards
• OmicsDI
Overview
How does mass spectrometry work?
1. Ionisation
2. Separation of ions by m/z
3. Detection
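The separation step sorts ions by their mass-to-charge ratio (m/z). As a rough sketch that is not part of the slides, the m/z of a protonated, positively charged ion can be estimated from its neutral monoisotopic mass M and charge z as (M + z × 1.007276) / z, where 1.007276 Da is the approximate proton mass. The function name `mz` and the example mass of 1000 Da are illustrative assumptions.

```python
PROTON_MASS = 1.007276  # Da, approximate mass of a proton


def mz(neutral_mass: float, charge: int) -> float:
    """Return the m/z of an ion carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    # The same molecule appears at different m/z values depending on its charge.
    for z in (1, 2, 3):
        print(f"z={z}: m/z={mz(1000.0, z):.4f}")
```

The same molecule is therefore detected at a lower m/z as its charge increases, which is why the charge state must be known before a mass can be reported.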
"Mass spectrometry has been described as the smallest scale in the world, not because
of the mass spectrometer’s size but because of the size of what it weighs: molecules."
Gary Siuzdak, Head of the Scripps centre for metabolomics and mass spectrometry, La Jolla, USA
Ion cannon
From Cheng Lu, Introduction to Mass Spectrometer. Figures from various sources (Liebler, Introduction to
Proteomics: Tools for the New Biology, Humana Press, 2002; Scripps; whatisms).
MS proteomics: tandem MS (bottom-up)
MS/MS matching identifies
peptides, not proteins.
Proteins are inferred from the
peptide sequences.
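The inference step above can be sketched in a few lines. This is a simplified illustration, not the method used by any particular search engine: it digests a toy protein database with a basic trypsin rule (cleave after K or R, not before P, no missed cleavages) and reports which proteins each identified peptide could come from. The protein names and sequences are made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical toy database: names and sequences are invented for illustration.
TOY_DATABASE = {
    "protA": "MKTAYIAKQR",
    "protB": "GSHMKTAYIAKLLR",
}


def tryptic_peptides(sequence: str) -> list[str]:
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def infer_proteins(identified: list[str], database: dict[str, str]) -> dict[str, list[str]]:
    """Map each identified peptide to the proteins that contain it."""
    peptide_to_proteins = defaultdict(set)
    for name, sequence in database.items():
        for peptide in tryptic_peptides(sequence):
            peptide_to_proteins[peptide].add(name)
    return {p: sorted(peptide_to_proteins.get(p, ())) for p in identified}


if __name__ == "__main__":
    for peptide, proteins in infer_proteins(["TAYIAK", "LLR"], TOY_DATABASE).items():
        kind = "unique" if len(proteins) == 1 else "shared"
        print(peptide, proteins, kind)
```

A peptide shared by several proteins cannot decide between them on its own, which is why protein lists are inferred rather than observed directly.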
Lasers are needed everywhere…
MALDI MS Laser swords
Data size of Mass Spectrometry data at the EBI (May 2015)
[Chart: Data accumulation by platform. x-axis: date (2004 to 2016); y-axis: bytes (log scale, 1.E+07 to 1.E+17); series: sequence, array, MS]
Chart generated by Guy Cochrane
• Very short intro to Mass Spectrometry
• PRIDE and MetaboLights
• ProteomeXchange and MetabolomeXchange
• Data standards and tools
• OmicsDI
Overview
• PRIDE Archive stores MS-based proteomics data:
• Peptide and protein expression data (identification & quantification)
• Post-translational modifications
• Mass spectra (raw data and peak lists)
• Technical and biological metadata
• Any other related information
• Focused on MS/MS approaches, but any type of proteomics workflow
can be stored.
• For each dataset PRIDE stores at least the raw data and the
processed results.
PRIDE (PRoteomics IDEntifications) Archive
http://www.ebi.ac.uk/pride Martens et al., Proteomics, 2005
Vizcaíno et al., NAR, 2016, in press
PRIDE: Source of MS proteomics data
• PRIDE Archive already provides MS
proteomics data to other EMBL-EBI
resources such as UniProt, Ensembl
and the Expression Atlas.
http://www.ebi.ac.uk/pride
Ways to access data in PRIDE Archive
• PRIDE web interface
• File repository
• REST web service
• PRIDE Inspector tool
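As a hedged illustration of the REST route, the sketch below only builds a request for one project's metadata and does not contact any server. The base URL and the `/projects/{accession}` path are assumptions that do not come from the slides; they may differ between API versions, so check the current PRIDE documentation before relying on them. The helper name `project_request` is hypothetical.

```python
import urllib.parse
import urllib.request

# Assumption: base URL of the PRIDE Archive REST API (not stated on the slide).
ASSUMED_BASE_URL = "https://www.ebi.ac.uk/pride/ws/archive/v2"


def project_request(accession: str, base_url: str = ASSUMED_BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON request for one project record."""
    # Assumption: projects are addressed as <base>/projects/<accession>.
    url = f"{base_url}/projects/{urllib.parse.quote(accession)}"
    return urllib.request.Request(url, headers={"Accept": "application/json"})


if __name__ == "__main__":
    request = project_request("PXD000001")
    print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call; that step is left out so the example runs offline.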
PRIDE Archive submitted datasets up until 1st November, 2015
• 1,259 datasets submitted to PRIDE Archive by 1st November
• 923 datasets were submitted in 2014
• In the last 6 months, 155 datasets submitted per month
• Size: ~160 TB
PRIDE Tools: Submission Process
PRIDE Converter 2
PRIDE Inspector PX Submission Tool
mzIdentML
PRIDE XML
2
PRIDE Inspector Toolsuite: Visualisation tool
Wang et al., Nat. Biotechnology, 2012
Perez-Riverol et al., MCP, 2016, in press
PRIDE Inspector Toolsuite supports:
- PRIDE XML
- mzIdentML + all types of spectra files
- mzML
- mzTab identification and quantification + all types of spectra files
https://github.com/PRIDE-Toolsuite/
PRIDE Tools: Submission Process
PRIDE Converter 2
PRIDE Inspector PX Submission Tool
mzIdentML
PRIDE XML
3
• It selects and captures the mappings between the different types of files included in the submission.
• It transfers all the files using Aspera (default) or FTP.
PX submission tool
[Diagram: Results, Raw and Other files are passed to the PX submission tool]
http://www.proteomexchange.org/submission
• Version 2.3.0 released in August 2015 (several refinements and improvements).
• An alternative command-line method is also available for groups with bioinformatics support.
MetaboLights – Logical components
• Data Submission
  - ISAcreator
  - Online data deposition
• Repository
  - Complete metabolomics experiments
  - Open data access
• Metabolite References
  - Metabolite annotation
• Analysis
  - Integrated data analysis
From http://www.isa-tools.org
MetaboLights – Submission Pipeline
Share private prepublication studies with
reviewers and other trusted parties.
Study upload
• Very short intro to Mass Spectrometry
• PRIDE and MetaboLights
• ProteomeXchange and MetabolomeXchange
• Data standards
• OmicsDI
Overview
ProteomeXchange Consortium
• Goal: Development of a framework to allow standard
data submission and dissemination pipelines
between the main existing proteomics repositories.
• Includes PeptideAtlas (ISB, Seattle), PRIDE
(Cambridge, UK) and (very recently) MassIVE (UCSD,
San Diego).
• Common identifier space (PXD identifiers)
• Two supported data workflows: MS/MS and SRM.
• Main objective: Make life easier for researchers
http://www.proteomexchange.org Vizcaíno et al., Nat Biotechnol, 2014
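PXD identifiers consist of the prefix PXD followed by six digits, so they are easy to validate in scripts. The sketch below also expands the shorthand ranges used elsewhere in this deck (for example PXD000320-324); treating the part after the hyphen as a replacement for the trailing digits is an assumption about that shorthand, and the function names are illustrative.

```python
import re

PXD_PATTERN = re.compile(r"PXD(\d{6})")
RANGE_PATTERN = re.compile(r"PXD(\d{6})-(\d{1,6})")


def is_pxd(identifier: str) -> bool:
    """True if the string is a well-formed PXD identifier."""
    return PXD_PATTERN.fullmatch(identifier) is not None


def expand_pxd_range(text: str) -> list[str]:
    """Expand shorthand such as 'PXD000320-324' into individual identifiers."""
    match = RANGE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a PXD range: {text!r}")
    start_digits, end_suffix = match.groups()
    # Assumption: the suffix replaces the last digits of the first identifier.
    end_digits = start_digits[: 6 - len(end_suffix)] + end_suffix
    start, end = int(start_digits), int(end_digits)
    if end < start:
        raise ValueError(f"range end precedes start: {text!r}")
    return [f"PXD{n:06d}" for n in range(start, end + 1)]


if __name__ == "__main__":
    print(is_pxd("PXD001471"))
    print(expand_pxd_range("PXD002319-26"))
```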
ProteomeXchange data workflow
[Diagram labels: Researcher’s results (Metadata / Manuscript, Raw Data*, Results); Receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data), Other DBs; ProteomeCentral; Journals; UniProt/neXtProt; PeptideAtlas; GPMDB; Other DBs; Reprocessed results (Raw data*, Metadata)]
ProteomeCentral: Portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
ProteomeXchange: 2,774 datasets up until 1st September, 2015
Type:
1681 PRIDE partial
813 PRIDE complete
173 MassIVE
84 PeptideAtlas/PASSEL complete
23 Reprocessed
Publicly Accessible:
1372 datasets, 49% of all
90% PRIDE
6% PASSEL
4% MassIVE
Data volume:
Total: ~150 TB
Number of all files: ~400,000
PXD000320-324: ~4 TB
PXD002319-26: ~2.4 TB
PXD001471: ~1.6 TB
Datasets/year:
2012: 102
2013: 527
2014: 963
2015: 1182
Top Species studied by at least 20 datasets:
1080 Homo sapiens
335 Mus musculus
110 Saccharomyces cerevisiae
98 Arabidopsis thaliana
75 Rattus norvegicus
58 Escherichia coli
29 Bos taurus
23 Glycine max
20 Caenorhabditis elegans
20 Oryza sativa
~ 500 species in total
Origin:
714 USA
313 Germany
252 United Kingdom
163 China
146 France
121 Netherlands
108 Switzerland
103 Canada
81 Denmark
73 Spain
68 Japan
67 Australia
63 Sweden
57 Belgium
43 Austria
39 India
34 Taiwan
33 Norway
26 Italy
24 Ireland
24 Finland
21 Republic of Korea
20 Brazil
20 Russia
18 Israel
18 Singapore …
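The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above (datasets by type, datasets per year, and the publicly accessible count); the variable names are illustrative.

```python
# Counts copied from the slide (ProteomeXchange, up until 1st September 2015).
TOTAL_DATASETS = 2774
PUBLIC_DATASETS = 1372

by_type = {
    "PRIDE partial": 1681,
    "PRIDE complete": 813,
    "MassIVE": 173,
    "PeptideAtlas/PASSEL complete": 84,
    "Reprocessed": 23,
}
by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 1182}

type_total = sum(by_type.values())
year_total = sum(by_year.values())
public_share = 100 * PUBLIC_DATASETS / TOTAL_DATASETS

if __name__ == "__main__":
    print("sum by type:", type_total)
    print("sum by year:", year_total)
    print(f"public share: {public_share:.1f}%")
```

Both breakdowns add up to the 2,774 datasets in the slide title, and the public share works out to roughly the 49% quoted.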
COSMOS - COordination Of Standards In MetabOlomicS
MetabolomeXchange Consortium
• Global network for exchange of
metabolomics data
• Includes study as well as reference
data
• Very short intro to Mass Spectrometry
• PRIDE and MetaboLights
• ProteomeXchange and MetabolomeXchange
• Data standards
• OmicsDI
Overview
Current PSI Proteomics Standard File Formats for MS
• mzTab: final results
• TraML: SRM
• mzQuantML: quantitation
• mzIdentML: identification
• mzML: MS data
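Of the formats listed, mzTab is the simplest to handle in a script because it is tab-separated text in which each line starts with a short section code (for example MTD for metadata, PRH/PRT for the protein header and rows, COM for comments). The sketch below groups lines by that code. It is a minimal illustration rather than a validating parser, and the sample content is toy data invented for the example.

```python
from collections import defaultdict

# Toy content for illustration only; not taken from a real submission.
TOY_MZTAB = "\n".join([
    "COM\tToy example, not real data",
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tdescription\tIllustrative file",
    "PRH\taccession\tdescription",
    "PRT\tPROT_A\tHypothetical protein",
])


def split_mztab_sections(text: str) -> dict[str, list[list[str]]]:
    """Group mzTab lines by their leading section code, skipping comments."""
    sections = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        prefix, *fields = line.split("\t")
        if prefix == "COM":
            continue  # comment lines carry no data
        sections[prefix].append(fields)
    return dict(sections)


if __name__ == "__main__":
    for code, rows in split_mztab_sections(TOY_MZTAB).items():
        print(code, rows)
```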
Current Metabolomics Standard File Formats for MS
• mzTab: final results
• TraML*: SRM
• mzQuantML*: quantitation
• mzIdentML: identification
• mzML: MS data
Data exchange standards in MS
Neumann (IPB-Halle), Proteomics and HUPO-PSI community
PRIDE Inspector Toolsuite: Visualisation tool
nmrTab
NMR data exchange standards
Neumann and D Schober (IPB-Halle), M Wilson and D Wishart (U Alberta, Canada), L Figueiredo and R Salek (EMBL-EBI), D Jacob
and C Deborde (Centre INRA de Bordeaux), P Rocca-Serra (University of Oxford e-Research Centre), T Ebbels (Imperial College),
C Ludwig and J Easton (University of Birmingham), A Moing (Centre INRA de Bordeaux), L Tenori and A Rosato
(University of Florence), I Lewis (Princeton) and many more
NMR data management facilitation via nmrML
http://nmrml.org
• Very short intro to Mass Spectrometry
• PRIDE and MetaboLights
• ProteomeXchange and MetabolomeXchange
• Data standards and tools
• OmicsDI
Overview
OmicsDI: Portal for omics datasets
http://www.ebi.ac.uk/Tools/omicsdi/
• Aims to integrate ‘omics’ datasets (genomics, proteomics and
metabolomics at present). Not only EBI resources are included.
PRIDE Archive
MassIVE
PASSEL
GPMDB
MetaboLights
Metabolomics Workbench
GNPS
EGA
OmicsDI: Functionality in the home page
[Screenshot of the home page, panels (a) to (f)]
Acknowledgements: People
Attila Csordas
Tobias Ternent
Noemi del Toro
Gerhard Mayer (Bochum, de.NBI)
Johannes Griss
Yasset Perez-Riverol
Henning Hermjakob
Former team members: Rui Wang,
Florian Reisinger and Jose A. Dianes
Other EBI teams involved in the
development of OmicsDI
Acknowledgements: The PRIDE Team
MetaboLights – The team
Kenneth Haug, Reza Salek, Jose Ramon Macias, Mark Williams, Kalai Jayaseelan, Namrata Kale,
Venkata Chandrasekhar, Christoph Steinbeck, Jules Griffin (UC & MRC), Xuefei Li (MRC)
Previous: Pablo Conesa, Paula de Matos, Mark Rijnbeek, Tejasvi Mahendraker, Xinzhu Wang (UC)
Questions?

More Related Content

What's hot

An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...Juan Antonio Vizcaino
 
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...Juan Antonio Vizcaino
 
Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Juan Antonio Vizcaino
 
ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015Juan Antonio Vizcaino
 
PRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarPRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarJuan Antonio Vizcaino
 
Semantics as a service at EMBL-EBI
Semantics as a service at EMBL-EBISemantics as a service at EMBL-EBI
Semantics as a service at EMBL-EBISimon Jupp
 
Ontologies neo4j-graph-workshop-berlin
Ontologies neo4j-graph-workshop-berlinOntologies neo4j-graph-workshop-berlin
Ontologies neo4j-graph-workshop-berlinSimon Jupp
 
Publication of raw and curated NMR spectroscopic data for organic molecules
Publication of raw and curated NMR spectroscopic data for organic moleculesPublication of raw and curated NMR spectroscopic data for organic molecules
Publication of raw and curated NMR spectroscopic data for organic moleculesChristoph Steinbeck
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)Carole Goble
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...Carole Goble
 
UniProt-GOA
UniProt-GOAUniProt-GOA
UniProt-GOAEBI
 
schema.org and biomedical ontologies
schema.org and biomedical ontologies schema.org and biomedical ontologies
schema.org and biomedical ontologies Simon Jupp
 
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseMaking Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseJustin Clark-Casey
 

What's hot (20)

Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...
 
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
 
Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...
 
ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015
 
Pride cluster presentation
Pride cluster presentation Pride cluster presentation
Pride cluster presentation
 
PRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarPRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinar
 
Semantics as a service at EMBL-EBI
Semantics as a service at EMBL-EBISemantics as a service at EMBL-EBI
Semantics as a service at EMBL-EBI
 
Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...
 
Ontologies neo4j-graph-workshop-berlin
Ontologies neo4j-graph-workshop-berlinOntologies neo4j-graph-workshop-berlin
Ontologies neo4j-graph-workshop-berlin
 
Publication of raw and curated NMR spectroscopic data for organic molecules
Publication of raw and curated NMR spectroscopic data for organic moleculesPublication of raw and curated NMR spectroscopic data for organic molecules
Publication of raw and curated NMR spectroscopic data for organic molecules
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
 
Open innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectOpen innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts project
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
 
UniProt-GOA
UniProt-GOAUniProt-GOA
UniProt-GOA
 
schema.org and biomedical ontologies
schema.org and biomedical ontologies schema.org and biomedical ontologies
schema.org and biomedical ontologies
 
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseMaking Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
 
Better Data for a Better World
Better Data for a Better WorldBetter Data for a Better World
Better Data for a Better World
 
A chemistry data repository to serve them all
A chemistry data repository to serve them allA chemistry data repository to serve them all
A chemistry data repository to serve them all
 

Similar to Mass spectrometry resources at the EBI

ProteomeXchange: data deposition and data retrieval made easy
ProteomeXchange: data deposition and data retrieval made easyProteomeXchange: data deposition and data retrieval made easy
ProteomeXchange: data deposition and data retrieval made easyJuan Antonio Vizcaino
 
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...Juan Antonio Vizcaino
 
PRIDE and ProteomeXchange – Making proteomics data accessible and reusable
PRIDE and ProteomeXchange – Making proteomics data accessible and reusablePRIDE and ProteomeXchange – Making proteomics data accessible and reusable
PRIDE and ProteomeXchange – Making proteomics data accessible and reusable Yasset Perez-Riverol
 
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)OpenAIRE
 
Mining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsMining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsJuan Antonio Vizcaino
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...BigData_Europe
 
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeData volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeJuan Antonio Vizcaino
 
The ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateThe ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateJuan Antonio Vizcaino
 
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation Research Data Alliance
 
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation Research Data Alliance
 
Fairification experience clarifying the semantics of data matrices
Fairification experience clarifying the semantics of data matricesFairification experience clarifying the semantics of data matrices
Fairification experience clarifying the semantics of data matricesPistoia Alliance
 
Europe PubMed Central and Linked Data
Europe PubMed Central and Linked DataEurope PubMed Central and Linked Data
Europe PubMed Central and Linked DataJee-Hyub Kim
 
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataPRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataJuan Antonio Vizcaino
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
 

Similar to Mass spectrometry resources at the EBI (20)

ProteomeXchange: data deposition and data retrieval made easy
ProteomeXchange: data deposition and data retrieval made easyProteomeXchange: data deposition and data retrieval made easy
ProteomeXchange: data deposition and data retrieval made easy
 
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
 
PRIDE and ProteomeXchange – Making proteomics data accessible and reusable
PRIDE and ProteomeXchange – Making proteomics data accessible and reusablePRIDE and ProteomeXchange – Making proteomics data accessible and reusable
PRIDE and ProteomeXchange – Making proteomics data accessible and reusable
 
ProteomeXchange update
ProteomeXchange updateProteomeXchange update
ProteomeXchange update
 
Human microbiome project
Human microbiome projectHuman microbiome project
Human microbiome project
 
ProteomeXchange update 2017
ProteomeXchange update 2017ProteomeXchange update 2017
ProteomeXchange update 2017
 
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)
(Open) Research Data Management in H2020 (ISERD – Tel Aviv, Oct 31, 2016)
 
Pride and ProteomeXchange
Pride and ProteomeXchangePride and ProteomeXchange
Pride and ProteomeXchange
 
Mining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsMining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasets
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
 
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeData volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
 
The ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateThe ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 update
 
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
 
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
 
Fairification experience clarifying the semantics of data matrices
Fairification experience clarifying the semantics of data matricesFairification experience clarifying the semantics of data matrices
Fairification experience clarifying the semantics of data matrices
 
Europe PubMed Central and Linked Data
Europe PubMed Central and Linked DataEurope PubMed Central and Linked Data
Europe PubMed Central and Linked Data
 
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataPRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
 
Data integration
Data integrationData integration
Data integration
 
Phd tesis olga giraldo 10mayo
Phd tesis olga giraldo 10mayoPhd tesis olga giraldo 10mayo
Phd tesis olga giraldo 10mayo
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 

More from Juan Antonio Vizcaino

Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Juan Antonio Vizcaino
 
Introduction to the PSI standard data formats
Introduction to the PSI standard data formatsIntroduction to the PSI standard data formats
Introduction to the PSI standard data formatsJuan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Juan Antonio Vizcaino
 
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...Juan Antonio Vizcaino
 
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...Juan Antonio Vizcaino
 
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...Juan Antonio Vizcaino
 
How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?Juan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Juan Antonio Vizcaino
 
Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Juan Antonio Vizcaino
 
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...Juan Antonio Vizcaino
 
Enabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataEnabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataJuan Antonio Vizcaino
 
Introduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRIntroduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRJuan Antonio Vizcaino
 

More from Juan Antonio Vizcaino (20)

Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...
 
Introduction to the PSI standard data formats
Introduction to the PSI standard data formatsIntroduction to the PSI standard data formats
Introduction to the PSI standard data formats
 
Reuse of public proteomics data
Mass spectrometry resources at the EBI

  • 1. The force of computational mass spectrometry awakens in the EBI. Juan Antonio Vizcaíno, Reza Salek, Ken Haug
  • 2. HRF Talk, 17 December 2015. Overview: • Very short intro to Mass Spectrometry • PRIDE and MetaboLights • ProteomeXchange and MetabolomeXchange • Data standards • OmicsDI
  • 3. How does mass spectrometry work? 1. Ionisation 2. Separation of ions by m/z 3. Detection. "Mass spectrometry has been described as the smallest scale in the world, not because of the mass spectrometer’s size but because of the size of what it weighs" (Gary Siuzdak, Head of the Scripps centre for metabolomics and mass spectrometry at La Jolla, USA). Ion cannon. From Cheng Lu, Introduction to Mass Spectrometer; figures from various sources (Liebler, Introduction to Proteomics: Tools for the New Biology, Humana Press, 2002; Scripps; whatisms).
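The m/z value that the instrument separates on (slide 3) can be illustrated with a short calculation. This is a minimal sketch, assuming positive-mode ionisation by protonation; the function name `mz` and the 1000.0 Da example mass are hypothetical and do not come from the talk.

```python
# Approximate mass of a proton in daltons (assumption: ions are formed
# by adding protons, as is typical in positive-mode ionisation).
PROTON_MASS = 1.007276


def mz(neutral_mass: float, charge: int) -> float:
    """Return the m/z of an ion carrying `charge` added protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical molecule with a neutral monoisotopic mass of 1000.0 Da:
# the same molecule appears at a different m/z for each charge state.
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

The same neutral mass therefore gives several peaks, one per charge state, which is why a charge has to be assigned before a mass can be read off a spectrum.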
  • 4. MS proteomics: tandem MS (bottom-up). MS/MS matching identifies peptides, not proteins. Proteins are inferred from the peptide sequences.
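The inference step on slide 4 can be sketched with a toy example. This is an illustration only, assuming that inference is reduced to exact substring matching; the protein accessions, sequences and peptides below are hypothetical, and real tools use considerably more elaborate scoring and grouping.

```python
from collections import defaultdict

# Toy protein database (hypothetical accessions and sequences).
PROTEINS = {
    "PROT_A": "MKTAYIAKQRQISFVKSHFSRQ",
    "PROT_B": "MSHFSRQLEERLGLIEVQ",
}


def infer_proteins(peptides, proteins):
    """Map each protein to the identified peptides found in its sequence."""
    evidence = defaultdict(list)
    for peptide in peptides:
        for accession, sequence in proteins.items():
            if peptide in sequence:
                evidence[accession].append(peptide)
    return dict(evidence)


# Peptides as they might come out of MS/MS matching (hypothetical).
identified = ["TAYIAK", "SHFSR", "LGLIEVQ"]
print(infer_proteins(identified, PROTEINS))
```

The peptide `SHFSR` occurs in both toy proteins, so on its own it cannot tell them apart. Shared peptides of this kind are the reason proteins are said to be inferred rather than identified directly.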
  • 5. Lasers are needed everywhere… MALDI MS. Laser swords.
  • 6. Data size of Mass Spectrometry data at the EBI (May 2015). [Chart: data accumulation by platform (sequence, array, MS); x-axis: date, 2004 to 2016; y-axis: bytes, logarithmic scale from 1.E+07 to 1.E+17.] Chart generated by Guy Cochrane.
  • 7. Overview: • Very short intro to Mass Spectrometry • PRIDE and MetaboLights • ProteomeXchange and MetabolomeXchange • Data standards and tools • OmicsDI
  • 8. PRIDE (PRoteomics IDEntifications) Archive, http://www.ebi.ac.uk/pride • PRIDE Archive stores MS-based proteomics data: peptide and protein expression data (identification and quantification); post-translational modifications; mass spectra (raw data and peak lists); technical and biological metadata; any other related information. • Focused on MS/MS approaches, but any type of proteomics workflow can be stored. • For each dataset PRIDE stores at least the raw data and the processed results. Martens et al., Proteomics, 2005; Vizcaíno et al., NAR, 2016, in press.
  • 9. PRIDE: Source of MS proteomics data • PRIDE Archive already provides MS proteomics data to other EMBL-EBI resources such as UniProt, Ensembl and the Expression Atlas. http://www.ebi.ac.uk/pride
  • 10. Ways to access data in PRIDE Archive • PRIDE web interface • File repository • REST web service • PRIDE Inspector tool
  • 11. PRIDE Archive submitted datasets up until 1st November, 2015 • 1,259 datasets submitted to PRIDE Archive by November 1st • 923 datasets were submitted in 2014 • In the last 6 months, 155 datasets submitted per month • Size: ~160 TB
  • 12. PRIDE Tools: Submission Process. Diagram labels: PRIDE Converter 2; PRIDE Inspector; PX Submission Tool; mzIdentML; PRIDE XML; 2
  • 13. PRIDE Inspector Toolsuite: Visualisation tool. PRIDE Inspector Toolsuite supports: - PRIDE XML - mzIdentML + all types of spectra files - mzML - mzTab identification and quantification + all types of spectra files. https://github.com/PRIDE-Toolsuite/ Wang et al., Nat. Biotechnology, 2012; Perez-Riverol et al., MCP, 2016, in press.
  • 14. PRIDE Inspector Toolsuite: Visualisation tool
  • 15. PRIDE Inspector Toolsuite: Visualisation tool
  • 16. PRIDE Tools: Submission Process. Diagram labels: PRIDE Converter 2; PRIDE Inspector; PX Submission Tool; mzIdentML; PRIDE XML; 3
  • 17. PX submission tool, http://www.proteomexchange.org/submission • It selects and captures the mappings between the different types of files included in the submission. File types shown: Results, Raw, Other files. • It transfers all the files using Aspera (default) or FTP. • Version 2.3.0 released in August 2015 (several refinements and improvements). • Alternative command-line method also available for groups with bioinformatics support.
  • 18. MetaboLights – Logical components • Data Submission: ISAcreator; online data deposition • Repository: complete metabolomics experiments; open data access • Metabolite References: metabolite annotation • Analysis: integrated data analysis
  • 19. From http://www.isa-tools.org
  • 20. MetaboLights – Submission Pipeline. Study upload. Share private prepublication studies with reviewers and other trusted parties.
  • 22. Overview: • Very short intro to Mass Spectrometry • PRIDE and MetaboLights • ProteomeXchange and MetabolomeXchange • Data standards • OmicsDI
  • 23. ProteomeXchange Consortium, http://www.proteomexchange.org • Goal: Development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories. • Includes PeptideAtlas (ISB, Seattle), PRIDE (Cambridge, UK) and (very recently) MassIVE (UCSD, San Diego). • Common identifier space (PXD identifiers) • Two supported data workflows: MS/MS and SRM. • Main objective: Make life easier for researchers. Vizcaíno et al., Nat Biotechnol, 2014
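The common PXD identifier space on slide 23 lends itself to a small validation helper. This is a sketch under an assumption: the pattern "PXD followed by six digits" is inferred from identifiers quoted in this talk (for example PXD001471) and is not taken from an official specification. The helper names `is_pxd` and `expand_range` are hypothetical.

```python
import re

# Assumption: "PXD" followed by exactly six digits, as in PXD001471.
PXD_PATTERN = re.compile(r"PXD(\d{6})")


def is_pxd(identifier: str) -> bool:
    """Return True if the string looks like a single PXD identifier."""
    return PXD_PATTERN.fullmatch(identifier) is not None


def expand_range(shorthand: str) -> list[str]:
    """Expand shorthand such as 'PXD000320-324' into individual identifiers."""
    start, _, suffix = shorthand.partition("-")
    match = PXD_PATTERN.fullmatch(start)
    if match is None:
        raise ValueError(f"not a PXD identifier: {start!r}")
    if not suffix:
        return [start]
    digits = match.group(1)
    if not suffix.isdigit() or len(suffix) > len(digits):
        raise ValueError(f"bad range suffix: {suffix!r}")
    # The suffix replaces the trailing digits of the first identifier.
    first = int(digits)
    last = int(digits[: len(digits) - len(suffix)] + suffix)
    if last < first:
        raise ValueError("range end precedes range start")
    return [f"PXD{n:06d}" for n in range(first, last + 1)]


print(is_pxd("PXD001471"))
print(expand_range("PXD000320-324"))
```

The range shorthand mirrors how dataset groups are abbreviated in the talk's statistics slide; treating the suffix as a replacement for the trailing digits is one reading of that shorthand.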
  • 24. ProteomeXchange data workflow. Diagram labels: ProteomeCentral; Metadata / Manuscript; Raw Data*; Results; Journals; UniProt/neXtProt; PeptideAtlas; Other DBs; Receiving repositories; PASSEL (SRM data); PRIDE (MS/MS data); Other DBs; GPMDB; Researcher’s results; Reprocessed results; Raw data*; Metadata; MassIVE (MS/MS data)
  • 25. ProteomeCentral: Portal for all PX datasets http://proteomecentral.proteomexchange.org/cgi/GetDataset
  • 26. ProteomeXchange: 2,774 datasets up until 1st September, 2015. Type: 1681 PRIDE partial; 813 PRIDE complete; 173 MassIVE; 84 PeptideAtlas/PASSEL complete; 23 Reprocessed. Publicly accessible: 1372 datasets, 49% of all (90% PRIDE, 6% PASSEL, 4% MassIVE). Data volume: total ~150 TB; number of all files ~400,000; PXD000320-324 ~4 TB; PXD002319-26 ~2.4 TB; PXD001471 ~1.6 TB. Datasets/year: 2012: 102; 2013: 527; 2014: 963; 2015: 1182. Top species studied by at least 20 datasets: 1080 Homo sapiens; 335 Mus musculus; 110 Saccharomyces cerevisiae; 98 Arabidopsis thaliana; 75 Rattus norvegicus; 58 Escherichia coli; 29 Bos taurus; 23 Glycine max; 20 Caenorhabditis elegans; 20 Oryza sativa; ~500 species in total. Origin: 714 USA; 313 Germany; 252 United Kingdom; 163 China; 146 France; 121 Netherlands; 108 Switzerland; 103 Canada; 81 Denmark; 73 Spain; 68 Japan; 67 Australia; 63 Sweden; 57 Belgium; 43 Austria; 39 India; 34 Taiwan; 33 Norway; 26 Italy; 24 Ireland; 24 Finland; 21 Republic of Korea; 20 Brazil; 20 Russia; 18 Israel; 18 Singapore; …
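The figures on slide 26 can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted on the slide; the variable names are illustrative.

```python
# Dataset counts by submission type, as quoted on slide 26.
by_type = {
    "PRIDE partial": 1681,
    "PRIDE complete": 813,
    "MassIVE": 173,
    "PeptideAtlas/PASSEL complete": 84,
    "Reprocessed": 23,
}

# Dataset counts by year, as quoted on the same slide.
by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 1182}

total = 2774   # datasets up until 1st September 2015
public = 1372  # publicly accessible datasets

# Both breakdowns should add up to the headline total.
type_total = sum(by_type.values())
year_total = sum(by_year.values())
print(type_total == total, year_total == total)

# Share of publicly accessible datasets, as a whole percentage.
public_share = public / total
print(round(100 * public_share))
```

Both breakdowns sum to the headline total of 2,774, and the public share rounds to the 49% stated on the slide.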
  • 27. COSMOS - COordination Of Standards In MetabOlomicS
  • 28. MetabolomeXchange Consortium • Global network for exchange of metabolomics data • Includes study as well as reference data
  • 30. Overview: • Very short intro to Mass Spectrometry • PRIDE and MetaboLights • ProteomeXchange and MetabolomeXchange • Data standards • OmicsDI
  • 31. Current PSI Proteomics Standard File Formats for MS • mzTab: final results • TraML: SRM • mzQuantML: quantitation • mzIdentML: identification • mzML: MS data
  • 32. Current Metabolomics Standard File Formats for MS • mzTab: final results • TraML*: SRM • mzQuantML*: quantitation • mzIdentML: identification • mzML: MS data
  • 33. Data exchange standards in MS. Neumann (IPB-Halle), Proteomics and HUPO-PSI community
  • 34. PRIDE Inspector Toolsuite: Visualisation tool
  • 35. NMR data exchange standards: nmrTab. Neumann and D Schober (IPB-Halle), M Wilson and D Wishart (U Alberta, Canada), L Figueiredo and R Salek (EMBL-EBI), D Jacob and C Deborde (Centre INRA de Bordeaux) and P Rocca-Serra (University of Oxford e-Research Centre), T Ebbels (Imperial College), C Ludwig, J Easton (University of Birmingham), A Moing (Centre INRA de Bordeaux), L Tenori (University of Florence), A Rosato (University of Florence), I Lewis (Princeton) and many more
  • 36. NMR data management facilitation via nmrML http://nmrml.org
  • 37. Overview: • Very short intro to Mass Spectrometry • PRIDE and MetaboLights • ProteomeXchange and MetabolomeXchange • Data standards and tools • OmicsDI
  • 38. OmicsDI: Portal for omics datasets http://www.ebi.ac.uk/Tools/omicsdi/ • Aims to integrate ‘omics’ datasets (genomics, proteomics and metabolomics at present). Not only EBI resources are included. Resources shown: PRIDE Archive, MassIVE, PASSEL, GPMDB, MetaboLights, Metabolomics Workbench, GNPS, EGA
  • 39. OmicsDI: Functionality in the home page. Panels (a) (b) (c) (d) (e) (f)
  • 40. Acknowledgements: The PRIDE Team. People: Attila Csordas, Tobias Ternent, Noemi del Toro, Gerhard Mayer (Bochum, de.NBI), Johannes Griss, Yasset Perez-Riverol, Henning Hermjakob. Former team members: Rui Wang, Florian Reisinger and Jose A. Dianes. Other EBI teams involved in the development of OmicsDI.
  • 41. MetaboLights – The team Previous: Pablo Conesa, Paula de Matos, Mark Rijnbeek, Tejasvi Mahendraker, Xinzhu Wang (UC) Kenneth Haug Reza Salek Jose Ramon Macias Mark Williams Kalai Jayaseelan Namrata Kale Venkata Chandrasekhar Christoph Steinbeck Jules Griffin (UC & MRC) Xuefei Li (MRC)
  • 42. Questions?