Status of ProteomeXchange in 2017
Dr. Juan Antonio Vizcaíno
EMBL-EBI
Hinxton, Cambridge, UK
PSI meeting 2017
Beijing, 24 April 2017
Overview
• Introduction and status in 2017
• Submission, download and citation statistics
• New prospective member: Firmiana
• OmicsDI interface
PRIDE is leading the global ProteomeXchange Consortium
PASSEL
(SRM data)
PRIDE
(MS/MS data)
MassIVE
(MS/MS data)
Raw
ID/Q
Meta
jPOST
(MS/MS data)
Mandatory raw data deposition
since July 2015
Goal: Development of a framework to allow standard data submission and
dissemination pipelines between the main existing proteomics repositories.
http://www.proteomexchange.org
New in 2016
Vizcaíno et al., Nat Biotechnol, 2014
Deutsch et al., NAR, 2017
ProteomeCentral
Metadata / Manuscript
Raw Data
Results
Journals
Peptide Atlas
Receiving repositories
PRIDE
Researcher’s results
Raw data
Metadata
PASSEL
Research groups
Reanalysis of datasets
MassIVE
jPOST
MS/MS data (as complete submissions)
Any other workflow (mainly partial submissions)
DATASETS
SRM data
Reprocessed results
MassIVE
ProteomeXchange data workflow
Vizcaíno et al., Nat Biotechnol, 2014
Deutsch et al., NAR, 2017
ProteomeCentral
Metadata / Manuscript
Raw Data
Results
Journals
UniProt/neXtProt
PeptideAtlas
Other DBs
Receiving repositories
PRIDE
GPMDB
Researcher’s results
Raw data
Metadata
PASSEL
proteomicsDB
Research groups
Reanalysis of datasets
MassIVE
jPOST
MS/MS data (as complete submissions)
Any other workflow (mainly partial submissions)
DATASETS
OmicsDI
Integration with other omics datasets
SRM data
Reprocessed results
MassIVE
ProteomeXchange data workflow
NAR update paper published
HPP guidelines version 2.1
Overview
• Introduction and status in 2017
• Submission, download and citation statistics
• New prospective member: Firmiana
• OmicsDI interface
Countries with at least 100
datasets:
1105 USA
546 Germany
411 United Kingdom
356 China
229 France
188 Netherlands
178 Canada
150 Switzerland
125 Australia
123 Spain
123 Denmark
117 Japan
101 Sweden
Publicly Accessible:
2597 datasets, 57% of all
2334 PRIDE
135 MassIVE
115 PASSEL
13 jPOST
Datasets/year:
2012: 102
2013: 527
2014: 963
2015: 1758
2016 (till July): 1184
Top Species studied by at least 100
datasets:
2010 Homo sapiens
604 Mus musculus
191 Saccharomyces cerevisiae
140 Arabidopsis thaliana
127 Rattus norvegicus
936 reported taxa in total
ProteomeXchange: 4,534 datasets up until end of July 2016
Countries with at least 100 submitted datasets:
1019 USA
734 Germany
492 United Kingdom
470 China
273 France
209 Netherlands
173 Canada
165 Switzerland
157 Australia
148 Austria
142 Denmark
137 Spain
115 Sweden
109 Japan
100 India
5,198 ProteomeXchange datasets in PRIDE (by March 2017)
Type:
3835 ‘Partial’ submissions (73.8%)
1363 ‘Complete’ submissions (26.2%)
Released: 3462 datasets (66.6%)
Unpublished: 1,736 datasets (33.4%)
Data volume in PRIDE:
Total: ~320 TB
Number of files: ~670,000
PXD000320-324: ~4 TB
PXD002319-26: ~2.4 TB
PXD001471: ~1.6 TB
Top Species represented (at least 100
datasets):
2267 Homo sapiens
765 Mus musculus
201 Saccharomyces cerevisiae
169 Arabidopsis thaliana
154 Rattus norvegicus
124 Escherichia coli
~1,000 species in total
1,940 datasets submitted to PRIDE in 2016
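The percentages quoted on this slide can be recomputed from the raw counts. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative only and are not part of any PRIDE tooling.

```python
# Counts copied from the slide (ProteomeXchange datasets in PRIDE, by March 2017).
total = 5198
counts = {
    "partial": 3835,
    "complete": 1363,
    "released": 3462,
    "unpublished": 1736,
}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage of all PRIDE datasets in each category.
shares = {name: share(n, total) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Partial and complete submissions add up to the total, as do released and unpublished datasets, so each pair of percentages sums to 100%.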
More PRIDE-centric stats…
Citation statistics
> 400 citations in one year
Citations are increasing
Naik, Nature, 9 Nov 2016
Public proteomics datasets are being increasingly reused…
Martens & Vizcaíno, Trends Biochem Sci, 2017
Download volumes are increasing as well
Data download volume for
PRIDE Archive in 2016: 243 TB
[Bar chart: PRIDE Archive downloads in TB per year, 2013 to 2016]
Overview
• Introduction and status in 2017
• Submission, download and citation statistics
• New prospective member: Firmiana
• OmicsDI interface
Firmiana to join PX?
http://www.firmiana.org/
Funding and status
• BBSRC Partnering grants with China and Japan obtained in 2016.
• Proposal submitted to NIH in November 2016 (call devoted to
data standards):
• Development of ProXI.
• Scoring received, waiting for the final decision.
• No further contacts with other proteomics resources.
Overview
• Introduction and status
• Submission and citation statistics
• New prospective members: jPOST and iPROX
• OmicsDI interface
ProteomeCentral: Portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
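As a rough illustration of how a single dataset page might be addressed on this portal, the sketch below builds a URL from a PXD accession. It assumes the accession is passed in an `ID` query parameter; that parameter name and the helper `dataset_url` are assumptions made for illustration and are not stated on the slide. No network request is made.

```python
from urllib.parse import urlencode

# Portal address as given on the slide.
BASE_URL = "http://proteomecentral.proteomexchange.org/cgi/GetDataset"


def dataset_url(accession):
    """Build a ProteomeCentral URL for one dataset accession.

    Assumption: the portal accepts the accession in an `ID` query parameter.
    """
    # Accept only identifiers of the form "PXD" followed by digits.
    if not (accession.startswith("PXD") and accession[3:].isdigit()):
        raise ValueError(f"not a PXD accession: {accession!r}")
    return f"{BASE_URL}?{urlencode({'ID': accession})}"


# PXD001471 is one of the large datasets mentioned in the PRIDE statistics.
print(dataset_url("PXD001471"))
```

The accession check is deliberately strict so that malformed identifiers fail early instead of producing a URL that silently returns nothing.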
Public datasets from different omics: OmicsDI
http://www.ebi.ac.uk/Tools/omicsdi/
• Aims to integrate ‘omics’ datasets (proteomics,
transcriptomics, metabolomics and genomics at present).
PRIDE
MassIVE
jPOST
PASSEL
GPMDB
ArrayExpress
Expression Atlas
MetaboLights
Metabolomics Workbench
GNPS
EGA
Perez-Riverol et al., Nat Biotechnol, in press
OmicsDI: Portal for omics datasets
Acknowledgements: People
Attila Csordas
Tobias Ternent
Gerhard Mayer (de.NBI)
Yasset Perez-Riverol
Manuel Bernal-Llinares
Andrew Jarnuczak
Mathias Walzer
Former team members, especially:
Rui Wang
Florian Reisinger
Noemi del Toro
Jose A. Dianes
Henning Hermjakob
Acknowledgements: The PRIDE Team and all PX partners
All data submitters !!!
Eric Deutsch
Zhi Sun
David Campbell
Nuno Bandeira
Mingxun Wang
Jeremy Carver
Yasushi Ishihama
Shujiro Okuda
Shin Kawano
Follow new datasets @proteomexchange
