ProteomeXchange
Dr. Juan Antonio Vizcaíno
EMBL-EBI
Hinxton, Cambridge, UK
ASMS Workshop
San Antonio, 8 June 2016
ProteomeXchange Consortium
• Goal: Development of a framework to allow standard
data submission and dissemination pipelines
between the main existing proteomics repositories.
• Includes PeptideAtlas (ISB, Seattle), PRIDE
(Cambridge, UK) and (very recently) MassIVE (UCSD,
San Diego).
• Common identifier space (PXD identifiers)
• Two supported data workflows: MS/MS and SRM.
• Data from all approaches can be submitted.
• Main objective: Make life easier for researchers
http://www.proteomexchange.org
Vizcaíno et al., Nat Biotechnol, 2014
ProteomeXchange data workflow
[Workflow diagram. Labels recovered from the figure:]
• Submission content: researcher's results, raw data*, metadata / manuscript
• Receiving repositories: PASSEL (SRM data), PRIDE (MS/MS data), MassIVE (MS/MS data)
• Central portal: ProteomeCentral
• Other resources: journals, UniProt/neXtProt, PeptideAtlas, GPMDB, other DBs
• Further labels: results, reprocessed results
ProteomeXchange: 3,802 datasets up until 1st April, 2016
Origin:
885 USA
465 Germany
342 United Kingdom
264 China
194 France
158 Netherlands
136 Canada
126 Switzerland
107 Denmark
104 Spain
99 Australia
95 Japan
72 Belgium
68 Austria
63 Sweden
61 India
51 Norway
43 Taiwan
30 Italy
29 Brazil
28 Singapore
28 Finland
27 Ireland
27 Russia
26 Israel …
Type:
2429 PRIDE partial
1016 PRIDE complete
250 MassIVE
84 PeptideAtlas/PASSEL complete
23 Reprocessed
Publicly Accessible:
1973 datasets, 52% of all
91% PRIDE
5% MassIVE
4% PASSEL
Datasets/year:
2012: 102
2013: 527
2014: 963
2015: 1758
2016: 452
Top Species studied by at least 20 datasets:
1526 Homo sapiens
485 Mus musculus
150 Saccharomyces cerevisiae
121 Arabidopsis thaliana
102 Rattus norvegicus
86 Escherichia coli
44 Bos taurus
35 Drosophila melanogaster
32 Glycine max
~ 700 species in total
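The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below copies the counts from the slide and confirms that the per-type and per-year breakdowns both add up to the stated total, and that the publicly accessible share rounds to 52%. The variable and function names are illustrative only.

```python
# Counts copied from the slide (datasets up until 1st April, 2016).
TOTAL_DATASETS = 3802

datasets_by_type = {
    "PRIDE partial": 2429,
    "PRIDE complete": 1016,
    "MassIVE": 250,
    "PeptideAtlas/PASSEL complete": 84,
    "Reprocessed": 23,
}

datasets_by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 1758, 2016: 452}

PUBLIC_DATASETS = 1973


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Both breakdowns should account for every dataset exactly once.
type_total = sum(datasets_by_type.values())
year_total = sum(datasets_by_year.values())
public_share = percent(PUBLIC_DATASETS, TOTAL_DATASETS)

print(f"Sum over types: {type_total}")
print(f"Sum over years: {year_total}")
print(f"Publicly accessible: {public_share}% of all datasets")
```

Note that the 2016 count covers only the months up to 1st April, so it is not directly comparable with the full-year figures.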
PRIDE Archive submitted datasets up until 1st April, 2016
• In the last year: ~150 submitted datasets per month
• Size: ~ 220 TB
ProteomeCentral: Portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
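As an illustration of how the common PXD identifier space ties into the portal, the following minimal sketch builds a ProteomeCentral link for a single dataset. Two things here are assumptions not stated on the slides: that accessions have the form `PXD` followed by six digits, and that the `GetDataset` script takes the accession in an `ID` query parameter. The function name `dataset_url` is hypothetical.

```python
import re
from urllib.parse import urlencode

# Base URL as given on the slide.
BASE_URL = "http://proteomecentral.proteomexchange.org/cgi/GetDataset"

# Assumption: accessions look like "PXD" plus six digits (e.g. PXD000001).
PXD_PATTERN = re.compile(r"PXD\d{6}")


def dataset_url(accession: str) -> str:
    """Build a ProteomeCentral link for one PXD accession.

    Assumes the accession is passed in an 'ID' query parameter.
    """
    if not PXD_PATTERN.fullmatch(accession):
        raise ValueError(f"Not a PXD accession: {accession!r}")
    return f"{BASE_URL}?{urlencode({'ID': accession})}"


print(dataset_url("PXD000001"))
```

Validating the accession before building the URL keeps malformed identifiers from being sent to the portal; the pattern should be relaxed if the identifier format differs from the assumption above.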
OmicsDI: Portal for omics datasets
http://www.ebi.ac.uk/Tools/omicsdi/
• Aims to integrate ‘omics’ datasets (genomics, proteomics and
metabolomics at present). Not only EBI resources are included.
PRIDE Archive
MassIVE
PASSEL
GPMDB
MetaboLights
Metabolomics Workbench
GNPS
EGA
Datasets are being downloaded…
Data download volume in 2015: ~ 200 TB
Data reuse is increasing
Vaudel et al., Proteomics, 2016
Acknowledgements: People
PRIDE team at EMBL-EBI
Attila Csordas
Tobias Ternent
Noemi del Toro
Gerhard Mayer (Bochum, de.NBI)
Johannes Griss
Yasset Perez-Riverol
Henning Hermjakob
Former team members: Rui Wang,
Florian Reisinger & Jose A. Dianes
Acknowledgements
PX partners
Eric Deutsch and his team at ISB
Nuno Bandeira and his team at UCSD
Twitter: @proteomexchange
