PRIDE-ProteomeXchange
1. PRIDE resources and ProteomeXchange
Dr. Juan Antonio Vizcaíno
PRIDE Group Coordinator
Proteomics Services Team
EMBL-EBI
Hinxton, Cambridge, UK
2. Juan A. Vizcaíno
juan@ebi.ac.uk
WT Proteomics Bioinformatics Course 2015
Hinxton, 10 December 2015
Data resources at EMBL-EBI
• Genes, genomes & variation: Ensembl, Ensembl Genomes, GWAS Catalog, Metagenomics portal, RNA Central, European Nucleotide Archive, European Variation Archive, European Genome-phenome Archive
• Gene, protein & metabolite expression: ArrayExpress, Expression Atlas, MetaboLights, PRIDE
• Protein sequences, families & motifs: InterPro, Pfam, UniProt
• Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
• Chemical biology: ChEMBL, ChEBI
• Reactions, interactions & pathways: IntAct, Reactome, MetaboLights
• Systems: BioModels, Enzyme Portal, BioSamples
• Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology
3. Juan A. Vizcaíno
• PRIDE Archive (in the context of ProteomeXchange
and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• How to access data in PRIDE Archive
• PRIDE Cluster and PRIDE Proteomes
Overview
5. Juan A. Vizcaíno
ProteomeXchange Consortium
• Goal: Development of a framework to allow
standard data submission and dissemination
pipelines between the main existing proteomics
repositories.
• Includes PeptideAtlas (ISB, Seattle), PRIDE
(Cambridge, UK) and (very recently) MassIVE
(UCSD, San Diego).
• Common identifier space (PXD identifiers)
• Two supported data workflows: MS/MS and SRM.
• Main objective: Make life easier for researchers
http://www.proteomexchange.org
6. Juan A. Vizcaíno
• PRIDE stores mass spectrometry (MS)-based proteomics data:
• Peptide and protein expression data
(identification and quantification)
• Post-translational modifications
• Mass spectra (raw data and peak
lists)
• Technical and biological metadata
• Any other related information
• Full support for tandem MS approaches
PRIDE (PRoteomics IDEntifications) database
http://www.ebi.ac.uk/pride/archive
Martens et al., Proteomics, 2005
Vizcaíno et al., NAR, 2013
7. Juan A. Vizcaíno
PRIDE Mission
• To archive all types of proteomics mass
spectrometry data for the purpose of supporting
reproducible research, allowing the application of
quality control metrics and enabling the reuse of
these data by other researchers.
• To integrate MS-based data in a protein-centric
manner to provide information on protein variants,
modifications, and expression.
• To provide mass spectrometry based expression
data to the Expression Atlas.
9. Juan A. Vizcaíno
Data content in PRIDE Archive
• Submission-driven resource
• PRIDE is organised into datasets (groups of assays)
• An assay represents one MS run (in most cases).
• No data reprocessing at present. PRIDE aims to represent
the author’s view of the data
• Supported formats: PRIDE XML and mzIdentML.
• Raw data are now also stored
10. Juan A. Vizcaíno
What is a proteomics publication in 2015?
• Proteomics studies generate potentially large amounts of
data and results.
• Ideally, a proteomics publication needs to:
• Summarize the results of the study
• Provide supporting information for reliability of any
results reported
• Information in a publication:
• Manuscript
• Supplementary material
• Associated data submitted to a public repository
11. Juan A. Vizcaíno
Journal Submission Recommendations
• Journal guidelines recommend submission to proteomics
repositories:
Proteomics
Nature Biotechnology
Nature Methods
Molecular and Cellular Proteomics
• Funding agencies are enforcing public deposition of data
to maximize the value of the funds provided.
12. Juan A. Vizcaíno
PRIDE: Source of MS proteomics data
• PRIDE Archive already provides or
will soon provide MS proteomics
data to other EMBL-EBI resources
such as UniProt, Ensembl and the
Expression Atlas.
http://www.ebi.ac.uk/pride
14. Juan A. Vizcaíno
ProteomeXchange data workflow
Vizcaíno et al., Nat Biotechnol, 2014
[Diagram labels: researcher’s results, raw data* and metadata; receiving repositories PRIDE (MS/MS data), MassIVE (MS/MS data) and PASSEL (SRM data); ProteomeCentral (metadata / manuscript); journals; UniProt/neXtProt; PeptideAtlas; GPMDB; other DBs; reprocessed results]
15. Juan A. Vizcaíno
• PRIDE Archive (in the context of ProteomeXchange
and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• How to access data in PRIDE Archive
• A sneak peek at other PRIDE resources
Overview
17. Juan A. Vizcaíno
Complete vs Partial submissions: processed results
[Screenshots: a complete submission and a partial submission]
For complete submissions, the spectra can be connected to the processed identification
results and visualized.
18. Juan A. Vizcaíno
Complete vs Partial submissions: experimental metadata
[Screenshots: a complete submission and a partial submission]
The general experimental metadata about the projects is similar.
However, at the assay level, the information in partial submissions is less detailed.
19. Juan A. Vizcaíno
How to perform a complete PX submission to PRIDE
• Decide between a complete/partial submission.
• File conversion/export to PRIDE XML or mzIdentML
• File check before submission (PRIDE Inspector)
• Experimental annotation and actual file submission (PX
submission tool)
• Post-submission steps
21. Juan A. Vizcaíno
PX Data workflow for MS/MS data
1. Mass spectrometer output files: raw data (binary files) or
peak list spectra in a standardized format (mzML, mzXML).
2. Result files:
a. Complete submissions: Result files can be converted to
PRIDE XML or the mzIdentML data standard.
b. Partial submissions: For workflows not yet supported by
PRIDE, search engine output files will be stored and
provided in their original form.
3. Metadata: Sufficiently detailed description of sample origin,
workflow, instrumentation, submitter.
4. Other files: Optional files (the list can be extended):
a. QUANT: Quantification related results
b. PEAK: Peak list files
c. GEL: Gel images
d. OTHER: Any other file type
e. FASTA
f. SP_LIBRARY
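The file categories listed above lend themselves to a simple pre-submission check. The sketch below is illustrative only: `SubmissionFile`, `classify_submission` and the labels RESULT, RAW and SEARCH are hypothetical names chosen for this example (QUANT, PEAK, GEL, OTHER, FASTA and SP_LIBRARY are the labels shown on the slide), and the rule it applies is a simplification of the complete/partial distinction, not the actual validation performed by the PX submission tool.

```python
from dataclasses import dataclass

# Optional file-type labels taken from the slide.
OPTIONAL_TYPES = {"QUANT", "PEAK", "GEL", "OTHER", "FASTA", "SP_LIBRARY"}
# RESULT, RAW and SEARCH are assumed labels for this sketch.
KNOWN_TYPES = OPTIONAL_TYPES | {"RESULT", "RAW", "SEARCH"}


@dataclass(frozen=True)
class SubmissionFile:
    name: str
    file_type: str


def classify_submission(files):
    """Return 'complete' or 'partial' for a list of SubmissionFile objects.

    Simplified rule: mass spectrometer output (raw data or peak lists) must be
    present; a RESULT file (PRIDE XML or mzIdentML) makes the submission
    complete, otherwise search engine output in its original form makes it
    partial.
    """
    types = {f.file_type for f in files}
    unknown = types - KNOWN_TYPES
    if unknown:
        raise ValueError(f"unknown file types: {sorted(unknown)}")
    if not types & {"RAW", "PEAK"}:
        raise ValueError("mass spectrometer output files are required")
    if "RESULT" in types:
        return "complete"
    if "SEARCH" in types:
        return "partial"
    raise ValueError("no result or search engine output files found")


files = [
    SubmissionFile("run01.raw", "RAW"),
    SubmissionFile("run01.mzid", "RESULT"),
    SubmissionFile("run01.mgf", "PEAK"),
]
print(classify_submission(files))
```

Replacing the RESULT entry with a SEARCH entry makes the same call return the partial classification, which mirrors the choice a submitter faces when the search engine output cannot be converted.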
22. Juan A. Vizcaíno
PRIDE Components: Submission Process
[Diagram: PRIDE Converter 2, PRIDE Inspector and PX Submission Tool, with the mzIdentML and PRIDE XML formats; step 1 highlighted]
23. Juan A. Vizcaíno
Before: only file conversion to PRIDE XML
• Original data files: search output files and spectra files
• ‘RESULT’ file generation: file conversion with PRIDE Converter or other tools, e.g. hEIDI
• Final ‘RESULT’ file: PRIDE XML
Barsnes et al., Nat Biotechnol, 2009
Cote et al., MCP, 2012
24. Juan A. Vizcaíno
Now: native file export to mzIdentML
• Tools: Mascot, ProteinPilot, Scaffold, PEAKS, MSGF+, others
• ‘RESULT’ file generation: native file export
• Final ‘RESULT’ file: mzIdentML, together with the spectra files (mzML, mzXML, mzData, mgf, pkl, ms2, dta, apl)
25. Juan A. Vizcaíno
Complete submissions: search engine results + MS files
An increasing number of tools support export to mzIdentML 1.1:
- Mascot
- MSGF+
- MyriMatch and related tools from D. Tabb’s lab
- OpenMS
- PEAKS
- PeptideShaker
- ProCon (ProteomeDiscoverer, Sequest)
- Scaffold
- TPP via the idConvert tool (ProteoWizard)
- ProteinPilot (from version 5.0)
- X!Tandem native conversion (Beta, PILEDRIVER)
- Crux
- Others: library for X!Tandem conversion, lab internal pipelines, …
Referenced spectral files need to be submitted as well (all open formats are supported).
Updated list: http://www.psidev.info/tools-implementing-mzIdentML#
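Because the spectra files referenced by an mzIdentML file must be submitted alongside it, it can help to list those references before uploading. The following is a minimal sketch, assuming that the references are held in the `location` attribute of `SpectraData` elements (as in mzIdentML 1.1); the helper names and the embedded example are hypothetical, and the example is a stripped-down fragment rather than a schema-valid file.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from urllib.parse import urlparse


def referenced_spectra_files(mzid_xml):
    """Return the file names referenced by SpectraData elements."""
    root = ET.fromstring(mzid_xml)
    names = []
    for element in root.iter():
        # Compare local names so the sketch does not depend on the namespace URI.
        if element.tag.rsplit("}", 1)[-1] == "SpectraData":
            location = element.get("location", "")
            # location may be a plain path or a file:// URI; keep the last segment.
            path = urlparse(location).path or location
            names.append(PurePosixPath(path.replace("\\", "/")).name)
    return names


def missing_spectra_files(mzid_xml, submitted_names):
    """Return referenced spectra files that are absent from the submission."""
    submitted = set(submitted_names)
    return [n for n in referenced_spectra_files(mzid_xml) if n not in submitted]


# Hypothetical, heavily reduced mzIdentML fragment for illustration.
EXAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <DataCollection>
    <Inputs>
      <SpectraData id="SD_1" location="file:///data/run01.mgf"/>
      <SpectraData id="SD_2" location="run02.mzML"/>
    </Inputs>
  </DataCollection>
</MzIdentML>"""

print(missing_spectra_files(EXAMPLE, ["run01.mgf"]))
```

PRIDE Inspector performs a far more thorough check; this sketch only catches the common case of a spectra file that was left out of the upload.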
26. Juan A. Vizcaíno
PRIDE Components: Submission Process
[Diagram: PRIDE Converter 2, PRIDE Inspector and PX Submission Tool, with the mzIdentML and PRIDE XML formats; step 2 highlighted]
27. Juan A. Vizcaíno
PRIDE Inspector Toolsuite
Wang et al., Nat. Biotechnology, 2012
Perez-Riverol et al., MCP, 2016, in press
PRIDE Inspector
PRIDE Inspector 2 supports:
- PRIDE XML
- mzIdentML + all types of spectra files
- mzML
- mzTab (identification and quantification)
https://github.com/PRIDE-Toolsuite/
28. Juan A. Vizcaíno
PRIDE Inspector 2
New visualisation functionality for protein groups
https://github.com/PRIDE-Toolsuite/
29. Juan A. Vizcaíno
PRIDE Components: Submission Process
[Diagram: PRIDE Converter 2, PRIDE Inspector and PX Submission Tool, with the mzIdentML and PRIDE XML formats; step 3 highlighted]
30. Juan A. Vizcaíno
PX submission tool
• Captures the mappings between the different types of files.
• Makes the file upload process straightforward for the submitter (it transfers all the
files using Aspera or FTP).
• Command line alternative: using the Aspera file transfer protocol.
http://www.proteomexchange.org/submission
32. Juan A. Vizcaíno
Fast file transfer with Aspera
- Aspera is the default file transfer protocol to PRIDE:
- PX Submission tool
- Command line
- Up to 50X faster than FTP
File transfer speed should
not be a problem!!
33. Juan A. Vizcaíno
Manuscript published detailing the process
Ternent et al., Proteomics, 2014
http://www.proteomexchange.org/submission
Example dataset:
PXD000764
- Title: “Discovery of new CSF biomarkers for meningitis in children”
- 12 runs: 4 controls and 8 infected samples
- Identification and quantification data
34. Juan A. Vizcaíno
PRIDE Archive: Number of submitted datasets in 2015
[Chart: number of datasets submitted to PRIDE Archive per month, as of November 1st 2015; y-axis from 0 to 200]
35. Juan A. Vizcaíno
ProteomeXchange: 2,774 datasets up until 1st September, 2015
Type:
1681 PRIDE partial
813 PRIDE complete
173 MassIVE
84 PeptideAtlas/PASSEL complete
23 Reprocessed
Publicly Accessible:
1372 datasets, 49% of all
90% PRIDE
6% PASSEL
4% MassIVE
Data volume:
Total: ~150 TB
Number of all files: ~400,000
PXD000320-324: ~ 4 TB
PXD002319-26 ~2.4 TB
PXD001471 ~1.6 TB
Datasets/year:
2012: 102
2013: 527
2014: 963
2015: 1182
Top species studied by at least 20 datasets:
1080 Homo sapiens
335 Mus musculus
110 Saccharomyces cerevisiae
98 Arabidopsis thaliana
75 Rattus norvegicus
58 Escherichia coli
29 Bos taurus
23 Glycine max
20 Caenorhabditis elegans
20 Oryza sativa
~ 500 species in total
Origin:
714 USA
313 Germany
252 United Kingdom
163 China
146 France
121 Netherlands
108 Switzerland
103 Canada
81 Denmark
73 Spain
68 Japan
67 Australia
63 Sweden
57 Belgium
43 Austria
39 India
34 Taiwan
33 Norway
26 Italy
24 Ireland
24 Finland
21 Republic of Korea
20 Brazil
20 Russia
18 Israel
18 Singapore …
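The headline figures on this slide can be cross-checked with a few lines of arithmetic. The numbers below are copied from the slide; the variable names are illustrative only.

```python
# Dataset counts by type, as listed on the slide.
by_type = {
    "PRIDE partial": 1681,
    "PRIDE complete": 813,
    "MassIVE": 173,
    "PeptideAtlas/PASSEL complete": 84,
    "Reprocessed": 23,
}
# Dataset counts by year, as listed on the slide.
by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 1182}

total = 2774   # all ProteomeXchange datasets
public = 1372  # publicly accessible datasets

# Both breakdowns should add up to the stated total.
print(sum(by_type.values()) == total)
print(sum(by_year.values()) == total)
# Share of publicly accessible datasets, in percent.
print(round(100 * public / total))
```

Both breakdowns sum to the stated total, and the public share rounds to the 49% quoted on the slide.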
36. Juan A. Vizcaíno
Public data release: when does it happen?
• When the author tells us to do it (the authors can do it by
themselves)
• When we find out that a dataset has been published
• We look for PXD identifiers in PubMed abstracts.
• If your PXD identifier is not in the abstract, a paper may have
been published and the data is still private. Let us know!
• New web form on the PRIDE website to facilitate the process
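The abstract-scanning step can be pictured as a simple pattern search. This is a minimal sketch, not the PRIDE team's actual pipeline: it assumes that a PXD identifier is the prefix PXD followed by six digits (as in PXD000764), and the `find_pxd_identifiers` helper and the sample abstract are made up for illustration.

```python
import re

# Assumed pattern: "PXD" followed by exactly six digits, e.g. PXD000764.
PXD_PATTERN = re.compile(r"\bPXD\d{6}\b")


def find_pxd_identifiers(text):
    """Return the distinct PXD identifiers in text, in order of first appearance."""
    seen = []
    for match in PXD_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


# Made-up abstract text for illustration.
abstract = (
    "The mass spectrometry data have been deposited to the "
    "ProteomeXchange Consortium via PRIDE with the dataset identifier "
    "PXD000764 (see also PXD000764 and PXD000320)."
)
print(find_pxd_identifiers(abstract))
```

An abstract that never mentions the identifier yields an empty list, which is exactly the case where the dataset can stay private after publication unless the authors get in touch.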
37. Juan A. Vizcaíno
Partial submissions can be used to store
other data types
• Everything can be stored, not only MS/MS data: very flexible
mechanism to be able to capture all types of datasets
• PRIDE does not store SRM data (it goes to PASSEL)
• Top down proteomics datasets.
• Mass Spectrometry Imaging datasets.
• Data independent acquisition techniques: e.g. SWATH-MS datasets.
38. Juan A. Vizcaíno
A public repository for mass spectrometry imaging data
Römpp et al., 2015
From original publication [13]:
1. Thermo RAW data / UDP
2. Mirion Software (JLU)
Reconstructed ProteomeXchange data:
1. Thermo RAW data / UDP
2. Convert to imzML
3. Upload to PRIDE (EBI, Cambridge, UK)
4. Download from PRIDE
5. Display in MSiReader
- Vendor-independent data format
- Freely available software (open source)
- ‘open data’ – free to reuse
- Anybody can do this!
39. Juan A. Vizcaíno
• PRIDE Archive (in the context of ProteomeXchange
and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• How to access data in PRIDE Archive
• PRIDE Cluster and PRIDE Proteomes
Overview
40. Juan A. Vizcaíno
Data access to PRIDE Archive
• Look for particular datasets of interest:
• For data reuse: which particular proteins and peptides
(including PTMs) have been detected.
• Data reinterpretation or re-analysis.
• Validation of the experimental results reported.
• Specific use cases for proteomics: spectral libraries,
fragmentation models, SRM transitions,…
42. Juan A. Vizcaíno
ProteomeCentral: Portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
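A link to a single dataset page can be built from the portal address above. In this sketch the base URL is the one shown on the slide, while the `ID` query parameter name and the `dataset_url` helper are assumptions made for illustration; check the ProteomeCentral site before relying on them.

```python
from urllib.parse import urlencode

# Base URL as given on the slide.
BASE_URL = "http://proteomecentral.proteomexchange.org/cgi/GetDataset"


def dataset_url(accession):
    """Build a dataset page URL for a PXD accession."""
    # "ID" as the query parameter name is an assumption of this sketch.
    return f"{BASE_URL}?{urlencode({'ID': accession})}"


print(dataset_url("PXD000764"))
```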
43. Juan A. Vizcaíno
RSS feed for public datasets
http://groups.google.com/group/proteomexchange/feed/rss_v2_0_msgs.xml
44. Juan A. Vizcaíno
Ways to access data in PRIDE Archive
• PRIDE web interface
• File repository
• REST web service
• PRIDE Inspector tool
47. Juan A. Vizcaíno
CompOmics Open Source Analysis Pipeline
SearchGUI: https://github.com/compomics/searchgui
Vaudel M, Barsnes H, Berven FS, Sickmann A, Martens L: Proteomics 2011;11(5):996-9.
PeptideShaker: https://github.com/compomics/peptide-shaker
Vaudel M, Burkhart J, Zahedi RP, Berven FS, Sickmann A, Martens L, Barsnes H: Nature Biotechnology 2015;33(1):22-24.
48. Juan A. Vizcaíno
Reshake PRIDE data!
Find the desired PRIDE project …
… inspect the project details …
… and start re-analyzing the data!
49. Juan A. Vizcaíno
• PRIDE Archive (in the context of ProteomeXchange
and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• How to access data in PRIDE Archive
• PRIDE Cluster and PRIDE Proteomes
Overview
51. Juan A. Vizcaíno
[Diagram labels: PRIDE Archive (original submissions, reprocessed datasets); aggregation; PRIDE Cluster; basic QC checks for PSMs; PRIDE Proteomes; link to the original evidence for original results]
52. Juan A. Vizcaíno
Sneak peek
• Provide an aggregated and QC-filtered peptide-centric
and protein-centric view of PRIDE Archive data.
http://www.ebi.ac.uk/pride/cluster/
http://wwwdev.ebi.ac.uk/pride/proteomes/
53. Juan A. Vizcaíno
PRIDE Cluster - Concept
• Use spectral clustering to reliably group spectra coming
from the same peptide
• Infer reliable identifications by comparing submitted
identifications of spectra within a cluster
• Increases quality through data increase (taking
advantage of the wealth of data in PRIDE).
• Inherently adapts to new (labelling) techniques
Griss et al., Nat Methods, 2013
54. Juan A. Vizcaíno
PRIDE Cluster - Concept
Griss et al., Nat Methods, 2013
[Diagram: originally submitted identified spectra, labelled NMMAACDPR or PPECPDFDPPR, are grouped into clusters, each with a consensus spectrum]
Threshold: At least 10 spectra in a cluster and ratio >70%.
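The threshold can be illustrated with a short sketch. It assumes that "ratio" means the fraction of spectra in a cluster that carry the most frequent submitted peptide identification; the `assess_cluster` helper and the example counts are hypothetical and are not taken from the PRIDE Cluster code base.

```python
from collections import Counter


def assess_cluster(peptides, min_size=10, min_ratio=0.70):
    """Return (top_peptide, ratio, is_reliable) for one cluster.

    peptides holds the submitted peptide identification of every spectrum in
    the cluster. The cluster counts as reliable when it has at least min_size
    spectra and the most frequent identification exceeds min_ratio.
    """
    if not peptides:
        raise ValueError("cluster is empty")
    top_peptide, top_count = Counter(peptides).most_common(1)[0]
    ratio = top_count / len(peptides)
    reliable = len(peptides) >= min_size and ratio > min_ratio
    return top_peptide, ratio, reliable


# Made-up cluster: 8 spectra identified as one peptide, 2 as another.
cluster = ["NMMAACDPR"] * 8 + ["PPECPDFDPPR"] * 2
print(assess_cluster(cluster))
```

With these counts the majority identification covers 8 of 10 spectra, so the cluster passes both conditions; a cluster of the same composition with only five spectra would fail on size alone.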
55. Juan A. Vizcaíno
PRIDE Cluster Home page
http://www.ebi.ac.uk/pride/cluster/#/
56. Juan A. Vizcaíno
PRIDE Cluster: result of searches
http://www.ebi.ac.uk/pride/cluster/#/
A couple of examples …
57. Juan A. Vizcaíno
Examples: one perfect cluster
- 880 PSMs give the same peptide ID
- 4 species
- 28 datasets
- Same instruments
59. Juan A. Vizcaíno
Examples: one perfect cluster (3)
What does that peptide sequence correspond to?
63. Juan A. Vizcaíno
PRIDE Cluster – Spectral libraries
http://www.ebi.ac.uk/pride/cluster/#/libraries
64. Juan A. Vizcaíno
PRIDE Proteomes: reusing PRIDE Cluster data
• Condensed and cross-dataset view of PRIDE Archive for
identification data:
• Data filtering of PSMs is performed at the level of the
submitted data.
• PSMs are grouped as peptide sequences.
• The peptide sequences are remapped to a recent version of
UniProtKB (at present UniProtKB “complete proteome”).
• Linked to the original supporting evidence.
• “PRIDE Cluster” used as extra evidence for the PSMs.
http://wwwdev.ebi.ac.uk/pride/proteomes/
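The filtering and grouping steps above can be sketched as follows. This is an illustration under stated assumptions, not the PRIDE Proteomes implementation: the `PSM` class, its `passed_qc` flag and the `group_psms` helper are hypothetical, the accessions are placeholders, and the remapping to UniProtKB is left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    sequence: str    # peptide sequence reported by the submitter
    dataset: str     # accession of the dataset the PSM came from
    passed_qc: bool  # outcome of the filtering step (criteria not modelled here)


def group_psms(psms):
    """Map each peptide sequence to the sorted datasets that support it."""
    evidence = defaultdict(set)
    for psm in psms:
        if psm.passed_qc:  # drop PSMs that failed the filter
            evidence[psm.sequence].add(psm.dataset)
    return {sequence: sorted(datasets) for sequence, datasets in evidence.items()}


# Placeholder PSMs for illustration.
psms = [
    PSM("NMMAACDPR", "PXD000001", True),
    PSM("NMMAACDPR", "PXD000002", True),
    PSM("PPECPDFDPPR", "PXD000001", False),
]
print(group_psms(psms))
```

Keeping the dataset accessions next to each peptide sequence is what preserves the link back to the original supporting evidence.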
65. Juan A. Vizcaíno
PRIDE: Using it for giving reliability to IDs
Link to PRIDE Cluster web
http://wwwdev.ebi.ac.uk/pride/proteomes/
67. Juan A. Vizcaíno
• Main characteristics of PRIDE Archive and
ProteomeXchange
• PX/PRIDE submission workflow for MS/MS data
• PRIDE Inspector
• PX submission tool
• PRIDE/ProteomeXchange has become the de facto
standard for data submission and data availability in
proteomics
• PRIDE Proteomes and PRIDE Cluster: new resources
Conclusions
68. Juan A. Vizcaíno
Do you want to know a bit more…?
http://www.slideshare.net/JuanAntonioVizcaino
69. Juan A. Vizcaíno
Acknowledgements: People
Attila Csordas
Tobias Ternent
Noemi del Toro
Johannes Griss
Yasset Perez-Riverol
Henning Hermjakob
All past team members, especially
Rui Wang, Florian Reisinger and
Jose A. Dianes
All ProteomeXchange partners,
especially Eric Deutsch and Nuno
Bandeira
Acknowledgements: The PRIDE Team and collaborators