2. PSI meeting 2017
Beijing, 24 April 2017
Overview
• Introduction and status in 2017
• Submission, download and citation statistics
• New prospective member: Firmiana
• OmicsDI interface
3. PSI meeting 2017
PRIDE is leading the global ProteomeXchange Consortium
Member repositories:
• PRIDE (MS/MS data)
• PASSEL (SRM data)
• MassIVE (MS/MS data)
• jPOST (MS/MS data), new in 2016
Data types per dataset: raw data (Raw), identification/quantification results (ID/Q), metadata (Meta)
Mandatory raw data deposition since July 2015
Goal: Development of a framework to allow standard data submission and
dissemination pipelines between the main existing proteomics repositories.
http://www.proteomexchange.org
Vizcaíno et al., Nat Biotechnol, 2014
Deutsch et al., NAR, 2017
4. PSI meeting 2017
ProteomeXchange data workflow
[Diagram. Research groups submit datasets (researcher's results, raw data, metadata) to the receiving repositories: PRIDE, MassIVE and jPOST for MS/MS data, either as complete submissions or, for any other workflow, mainly as partial submissions; PASSEL for SRM data. Other elements shown: ProteomeCentral (metadata / manuscript, raw data, results), journals, and reanalysis of datasets by PeptideAtlas and MassIVE (reprocessed results).]
Vizcaíno et al., Nat Biotechnol, 2014
Deutsch et al., NAR, 2017
5. PSI meeting 2017
ProteomeXchange data workflow
[Diagram. Research groups submit datasets (researcher's results, raw data, metadata) to the receiving repositories: PRIDE, MassIVE and jPOST for MS/MS data, either as complete submissions or, for any other workflow, mainly as partial submissions; PASSEL for SRM data. Other elements shown: ProteomeCentral (metadata / manuscript, raw data, results) and journals; resources reusing the datasets (PeptideAtlas, MassIVE, GPMDB, proteomicsDB, UniProt/neXtProt and other DBs), with reanalysis of datasets giving reprocessed results; and OmicsDI for integration with other omics datasets.]
9. PSI meeting 2017
Countries with at least 100 datasets:
1105 USA
546 Germany
411 United Kingdom
356 China
229 France
188 Netherlands
178 Canada
150 Switzerland
125 Australia
123 Spain
123 Denmark
117 Japan
101 Sweden
Publicly Accessible:
2597 datasets, 57% of all
2334 PRIDE
135 MassIVE
115 PASSEL
13 jPOST
Datasets/year:
2012: 102
2013: 527
2014: 963
2015: 1758
2016 (till July): 1184
Top species studied in at least 100 datasets:
2010 Homo sapiens
604 Mus musculus
191 Saccharomyces cerevisiae
140 Arabidopsis thaliana
127 Rattus norvegicus
936 reported taxa in total
ProteomeXchange: 4,534 datasets up until end of July 2016
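The figures on this slide can be cross-checked against each other. The following is a minimal sketch that only re-adds the counts quoted above; the variable names are illustrative and not part of any ProteomeXchange tool.

```python
# Counts copied from the slide; names are illustrative only.
total_datasets = 4534
public_by_repository = {"PRIDE": 2334, "MassIVE": 135, "PASSEL": 115, "jPOST": 13}
datasets_by_year = {
    "2012": 102,
    "2013": 527,
    "2014": 963,
    "2015": 1758,
    "2016 (till July)": 1184,
}

# Sum of the per-repository public counts and their share of all datasets.
public_total = sum(public_by_repository.values())
public_share = round(100 * public_total / total_datasets)

# Sum of the per-year counts, expected to match the overall total.
yearly_total = sum(datasets_by_year.values())

print(f"public: {public_total} ({public_share}% of {total_datasets})")
print(f"sum over years: {yearly_total}")
```

Under these numbers the per-repository counts add up to the quoted public total, and the per-year counts add up to the overall total.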
10. PSI meeting 2017
Countries with at least 100 submitted datasets:
1019 USA
734 Germany
492 United Kingdom
470 China
273 France
209 Netherlands
173 Canada
165 Switzerland
157 Australia
148 Austria
142 Denmark
137 Spain
115 Sweden
109 Japan
100 India
5,198 ProteomeXchange datasets in PRIDE (by March 2017)
Type:
3835 ‘Partial’ submissions (73.8%)
1363 ‘Complete’ submissions (26.2%)
Released: 3462 datasets (66.6%)
Unpublished: 1,736 datasets (33.4%)
Data volume in PRIDE:
Total: ~320 TB
Number of files: ~670,000
PXD000320-324: ~4 TB
PXD002319-26: ~2.4 TB
PXD001471: ~1.6 TB
Top species represented (at least 100 datasets):
2267 Homo sapiens
765 Mus musculus
201 Saccharomyces cerevisiae
169 Arabidopsis thaliana
154 Rattus norvegicus
124 Escherichia coli
~1000 species in total
1,940 datasets submitted to PRIDE in 2016
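The percentages quoted on this slide follow directly from the dataset counts. A short sketch of that arithmetic, using only the numbers above (the helper `share` is a hypothetical name introduced for illustration):

```python
# Counts copied from the slide (PRIDE, by March 2017).
total = 5198
counts = {
    "partial": 3835,
    "complete": 1363,
    "released": 3462,
    "unpublished": 1736,
}


def share(count: int, whole: int) -> float:
    """Return count as a percentage of whole, rounded to one decimal place."""
    return round(100 * count / whole, 1)


shares = {name: share(count, total) for name, count in counts.items()}

for name, value in shares.items():
    print(f"{name}: {counts[name]} datasets ({value}%)")
```

Both pairs (partial/complete and released/unpublished) partition the same 5,198 datasets, so each pair of counts sums to the total.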
14. PSI meeting 2017
Public proteomics datasets are being increasingly reused…
Martens & Vizcaíno, Trends Biochem Sci, 2017
15. PSI meeting 2017
Download volumes are increasing as well
Data download volume for PRIDE Archive in 2016: 243 TB
[Bar chart: downloads in TB per year, 2013 to 2016]
18. PSI meeting 2017
Funding and status
• BBSRC Partnering grants with China and Japan obtained in 2016.
• Proposal submitted to NIH in November 2016 (call devoted to data standards):
• Development of ProXI.
• Score received, waiting for the final decision.
• No further contacts with other proteomics resources.
19. PSI meeting 2017
Overview
• Introduction and status
• Submission and citation statistics
• New prospective members: jPOST and iPROX
• OmicsDI interface
20. PSI meeting 2017
ProteomeCentral: Portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
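As an illustration of how a single dataset might be linked from the portal, the sketch below builds a URL from the address on this slide. The `ID` query parameter and the helper name `dataset_url` are assumptions made for this example and are not stated on the slide; the accession is one quoted earlier in the deck.

```python
from urllib.parse import urlencode

# Portal address as given on the slide.
BASE_URL = "http://proteomecentral.proteomexchange.org/cgi/GetDataset"


def dataset_url(accession: str) -> str:
    """Build a portal link for one PXD accession.

    Assumption: the portal accepts the accession in an `ID` query parameter.
    """
    return f"{BASE_URL}?{urlencode({'ID': accession})}"


print(dataset_url("PXD001471"))
```

The function only formats a string and makes no network request, so it can be tried without access to the portal.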
22. PSI meeting 2017
Public datasets from different omics: OmicsDI
http://www.ebi.ac.uk/Tools/omicsdi/
• Aims to integrate ‘omics’ datasets (proteomics,
transcriptomics, metabolomics and genomics at present).
PRIDE
MassIVE
jPOST
PASSEL
GPMDB
ArrayExpress
Expression Atlas
MetaboLights
Metabolomics Workbench
GNPS
EGA
Perez-Riverol et al., Nat Biotechnol, in press
25. PSI meeting 2017
Acknowledgements: People
Attila Csordas
Tobias Ternent
Gerhard Mayer (de.NBI)
Yasset Perez-Riverol
Manuel Bernal-Llinares
Andrew Jarnuczak
Mathias Walzer
Former team members, especially:
Rui Wang
Florian Reisinger
Noemi del Toro
Jose A. Dianes
Henning Hermjakob
Acknowledgements: The PRIDE Team and all PX partners
All data submitters !!!
Eric Deutsch
Zhi Sun
David Campbell
Nuno Bandeira
Mingxun Wang
Jeremy Carver
Yasushi Ishihama
Shujiro Okuda
Shin Kawano
Follow new datasets @proteomexchange