PRIDE and ProteomeXchange: Share and
explore public proteomics datasets like never
before
Dr. Juan Antonio Vizcaíno
PRIDE Group Coordinator
Proteomics Services Team
EMBL-EBI
Hinxton, Cambridge, UK
Juan A. Vizcaíno
juan@ebi.ac.uk
Midwinter Proteomics Bioinformatics Seminar
Semmering, 15 January 2015
Overview
• PRIDE Archive (in the context of ProteomeXchange and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• ProteomeCentral, submission and access stats
Data sharing in Proteomics
• Public availability of data in proteomics enables:
• Reinterpretation (e.g. data reprocessing with different aims)
• Improved analysis software.
• Changes in protein sequence databases (e.g. proteogenomics studies).
• Consideration of new post-translational modifications.
• Validation of the reported experimental results.
• Specific use cases for proteomics: spectral libraries,
fragmentation models, SRM transitions,…
PRIDE (PRoteomics IDEntifications) database
http://www.ebi.ac.uk/pride
• PRIDE stores mass spectrometry (MS)-based proteomics data:
• Peptide and protein expression data (identification and quantification)
• Post-translational modifications
• Mass spectra (raw data and peak lists)
• Technical and biological metadata
• Any other related information
• Full support for tandem MS approaches
Martens et al., Proteomics, 2005
Vizcaíno et al., NAR, 2013
PRIDE Archive
• New PRIDE DB archival system from 01/2014. Three iterations
released so far. Still work in progress.
• Very flexible; its development has happened in parallel with:
• Implementation of ProteomeXchange.
• New community PSI data standards: mzIdentML, mzQuantML and
mzTab.
ProteomeXchange Consortium
• Goal: Development of a framework to allow standard
data submission and dissemination pipelines
between the main existing proteomics repositories.
• Includes PeptideAtlas (ISB, Seattle), PRIDE
(Cambridge, UK) and MassIVE (UCSD, San Diego).
• Tranche and Peptidome initially included but
discontinued.
• Common identifier space (PXD identifiers)
• Two supported data workflows: MS/MS and SRM.
• Main objective: Make life easier for researchers
http://www.proteomexchange.org
Vizcaíno et al., Nat Biotechnol, 2014
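As a small illustration of the common identifier space, the sketch below checks whether a string looks like a PXD identifier and pulls such identifiers out of free text. The pattern (the prefix PXD followed by six digits) is an assumption inferred from identifiers quoted in these slides, such as PXD000764, and the function names are made up for this example.

```python
import re
from typing import List

# Assumed pattern: "PXD" followed by exactly six digits (inferred from
# examples such as PXD000764, not taken from a formal specification).
PXD_PATTERN = re.compile(r"\bPXD\d{6}\b")


def is_pxd_identifier(text: str) -> bool:
    """Return True if the whole string matches the assumed PXD pattern."""
    return PXD_PATTERN.fullmatch(text) is not None


def find_pxd_identifiers(text: str) -> List[str]:
    """Return every PXD-like identifier found in free text, in order."""
    return PXD_PATTERN.findall(text)
```

A manuscript's data availability statement could be scanned this way to collect the datasets it cites.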
Current PSI Standard File Formats for MS
• Final results: mzTab (Griss et al., MCP, 2014)
• SRM: TraML (Deutsch et al., MCP, 2012)
• Quantitation: mzQuantML (Walter et al., MCP, 2013)
• Identification: mzIdentML (Jones et al., MCP, 2012)
• MS data: mzML (Martens et al., MCP, 2011)
mzTab format: tab delimited format (ident/quant)
http://code.google.com/p/mztab/
J. Griss et al., MCP, 2014
Q.W. Xu et al., Proteomics, 2014
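Because mzTab is tab delimited, it can be read with very little code. The sketch below assumes the mzTab 1.0 convention that every line starts with a three-letter prefix (MTD for metadata, PRH/PRT for the protein header and rows, PSH/PSM for PSMs, and so on). The sample content and the function name are invented for illustration, and real files carry many more columns.

```python
import io
from collections import defaultdict

# Invented toy content; not taken from a real dataset.
SAMPLE = (
    "MTD\tmzTab-version\t1.0.0\n"
    "MTD\tdescription\tToy example\n"
    "PRH\taccession\tdescription\n"
    "PRT\tP12345\tExample protein\n"
    "PSH\tsequence\taccession\n"
    "PSM\tPEPTIDEK\tP12345\n"
)

# Each header prefix announces the column names for one row prefix.
HEADER_TO_ROW = {"PRH": "PRT", "PEH": "PEP", "PSH": "PSM", "SMH": "SML"}


def parse_mztab(handle):
    """Split an mzTab stream into a metadata dict and per-section row lists."""
    metadata = {}
    headers = {}
    tables = defaultdict(list)
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        prefix, values = fields[0], fields[1:]
        if prefix == "MTD" and len(values) >= 2:
            metadata[values[0]] = values[1]
        elif prefix in HEADER_TO_ROW:
            headers[HEADER_TO_ROW[prefix]] = values
        elif prefix in headers:
            # Pair each value with its column name from the section header.
            tables[prefix].append(dict(zip(headers[prefix], values)))
        # Blank lines, COM (comment) lines and unknown prefixes are skipped.
    return metadata, dict(tables)


metadata, tables = parse_mztab(io.StringIO(SAMPLE))
```

For production use, the reference libraries cited on the slide are the safer choice, since they also validate the file.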
Ways to access data in PRIDE Archive
• PRIDE web interface
• File repository
• REST web service
• PRIDE Inspector tool
PRIDE Archive web interface
PRIDE Archive web interface (2)
• Next: visualization of spectra (in a couple of weeks)
Programmatic access: PRIDE REST web service
http://www.ebi.ac.uk/pride/ws/archive/
• Intended to replace the most popular functionality provided by the PRIDE Biomart interface (now discontinued)
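Programmatic access could look roughly like the sketch below, which builds a request URL under the base address shown on the slide and decodes a JSON response. The resource path `project/<accession>` and the JSON response format are assumptions made for illustration; the service documentation is the authority on the actual paths and fields.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://www.ebi.ac.uk/pride/ws/archive/"  # base address from the slide


def project_url(accession: str) -> str:
    """Build the URL for one project (assumed layout: <base>project/<accession>)."""
    return BASE_URL + "project/" + quote(accession, safe="")


def fetch_project(accession: str, timeout: float = 30.0) -> dict:
    """Download and decode the project record; assumes the service returns JSON."""
    with urlopen(project_url(accession), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_project("PXD000764")` needs network access and only works while the service exposes that path.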
Overview
• Introduction to PRIDE Archive (in the context of ProteomeXchange and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• ProteomeCentral, submission and access stats
• A sneak peek at data reuse
ProteomeXchange data workflow: PRIDE
[Workflow diagram. Labels: Researcher’s results; Raw data*; Results; Metadata / Manuscript; Receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data); ProteomeCentral; Journals; UniProt/neXtProt; Peptide Atlas; GPMDB; Other DBs; Reprocessed results]
Manuscript published detailing the process
Ternent et al., Proteomics, 2014
http://www.proteomexchange.org/submission
Example dataset: PXD000764
- Title: “Discovery of new CSF biomarkers for meningitis in children”
- 12 runs: 4 controls and 8 infected samples
- Identification and quantification data
PX Data workflow for MS/MS data
1. Mass spectrometer output files: raw data (binary files) or peak list
spectra in a standardized format (mzML, mzXML).
2. Result files:
a. Complete submissions: Result files can be converted to
PRIDE XML or the mzIdentML data standard.
b. Partial submissions: For workflows not yet supported by
PRIDE, search engine output files will be stored and provided in
their original form.
3. Metadata: Sufficiently detailed description of sample origin,
workflow, instrumentation, submitter.
4. Other files: Optional files:
a. QUANT: Quantification related results
b. PEAK: Peak list files
c. GEL: Gel images
d. OTHER: Any other file type
e. FASTA
f. SP_LIBRARY
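The distinction between complete and partial submissions can be sketched as a rule over the file types present in a dataset. The labels RAW, RESULT and SEARCH are assumptions modelled on the QUANT/PEAK/GEL/OTHER labels above, and the function is purely illustrative; the checks performed by the actual submission tools are more thorough.

```python
from typing import Iterable


def classify_submission(file_types: Iterable[str]) -> str:
    """Classify a dataset as 'complete', 'partial' or 'incomplete'.

    Rule of thumb taken from the workflow above:
    - mass spectrometer output (raw data or peak lists) is always needed;
    - RESULT files (PRIDE XML or mzIdentML) make a complete submission;
    - otherwise search engine output (SEARCH) makes a partial submission.
    """
    types = {t.upper() for t in file_types}
    has_ms_output = "RAW" in types or "PEAK" in types
    if not has_ms_output:
        return "incomplete"
    if "RESULT" in types:
        return "complete"
    if "SEARCH" in types:
        return "partial"
    return "incomplete"
```

For example, a dataset with raw files, search engine output and a FASTA file would be classified as partial under this rule.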
Complete vs Partial submissions: experimental metadata
Complete Partial
General experimental metadata about the projects is similar.
However, at the assay level, the information in partial submissions is less detailed.
Complete vs Partial submissions: processed results
Complete Partial
For complete submissions, the spectra can be connected to the processed identification results and visualized.
PX Data workflow for MS/MS data
1. Mass spectrometer output files: raw data (binary files) or peak list
spectra in a standardized format (mzML, mzXML).
2. Result files:
a. Complete submissions: Result files can be converted to
PRIDE XML or the mzIdentML data standard.
b. Partial submissions: For workflows not yet supported by
PRIDE, search engine output files will be stored and provided in
their original form.
3. Metadata: Sufficiently detailed description of sample origin,
workflow, instrumentation, submitter.
4. Other files: Optional files (the list can be extended):
a. QUANT: Quantification related results
b. PEAK: Peak list files
c. GEL: Gel images
d. OTHER: Any other file type
e. FASTA
f. SP_LIBRARY
PRIDE Components: Submission Process (step 1)
Tools: PRIDE Converter 2, PRIDE Inspector, PX Submission Tool
File formats: mzIdentML, PRIDE XML
Before: only file conversion to PRIDE XML
[Diagram: original data files (search output files, spectra files) → ‘RESULT’ file generation (file conversion with PRIDE Converter) → final ‘RESULT’ file (PRIDE XML)]
Now: native file export
[Diagram: tools (Mascot, ProteinPilot, Scaffold, PEAKS, MSGF+, others) → ‘RESULT’ file generation (native file export) → final ‘RESULT’ file (mzIdentML), together with the spectra files]
Complete submissions
[Diagram: search engine results + MS files → mzIdentML]
An increasing number of tools support export to mzIdentML 1.1:
- Mascot
- MSGF+
- Myrimatch and related tools from D. Tabb’s lab
- OpenMS
- PEAKS
- PeptideShaker
- ProCon (ProteomeDiscoverer, Sequest)
- Scaffold
- TPP via the idConvert tool (ProteoWizard)
- ProteinPilot (from version 5.0)
- Crux
- Others: library for X!Tandem conversion, lab internal pipelines, …
Referenced spectral files need to be submitted as well (all open formats are supported).
Updated list: http://www.psidev.info/tools-implementing-mzIdentML#
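Since the spectra files referenced by an mzIdentML file have to be submitted with it, it can help to list those references before uploading. The sketch below reads the `location` attribute of each `SpectraData` element in the mzIdentML 1.1 namespace. The embedded sample is an invented, heavily trimmed fragment (not a schema-valid file), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by mzIdentML 1.1 documents.
MZID_NS = {"mzid": "http://psidev.info/psi/pi/mzIdentML/1.1"}

# Invented, trimmed-down fragment for illustration only.
SAMPLE_MZID = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1" id="toy" version="1.1.0">
  <DataCollection>
    <Inputs>
      <SpectraData id="SD_1" location="file:///data/run01.mgf"/>
      <SpectraData id="SD_2" location="file:///data/run02.mzML"/>
    </Inputs>
  </DataCollection>
</MzIdentML>
"""


def referenced_spectra_files(mzid_text: str) -> List[str]:
    """Return the location of every SpectraData element, in document order."""
    root = ET.fromstring(mzid_text)
    return [
        element.get("location")
        for element in root.iterfind(".//mzid:SpectraData", MZID_NS)
    ]


locations = referenced_spectra_files(SAMPLE_MZID)
```

Large result files would be better handled with a streaming parser, since this version loads the whole document into memory.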
In the near future: native file export
[Diagram: tools (Mascot, ProteinPilot, Scaffold, PEAKS, MSGF+, others) → ‘RESULT’ file generation (native file export) → final ‘RESULT’ file (mzTab), together with the spectra files]
PRIDE Components: Submission Process (step 2)
Tools: PRIDE Converter 2, PRIDE Inspector, PX Submission Tool
File formats: mzIdentML, PRIDE XML
PRIDE Inspector 2
Wang et al., Nat. Biotechnology, 2012
PRIDE Inspector 2 supports:
- PRIDE XML
- mzIdentML + all types of spectra files
- mzML
- mzTab Quantitation (work in progress)
https://github.com/PRIDE-Toolsuite/
PRIDE Inspector 2
New visualisation functionality for Protein Groups
https://github.com/PRIDE-Toolsuite/
PRIDE Inspector 2
Private review of files submitted to PRIDE
https://github.com/PRIDE-Toolsuite/
PRIDE Components: Submission Process (step 3)
Tools: PRIDE Converter 2, PRIDE Inspector, PX Submission Tool
File formats: mzIdentML, PRIDE XML
PX submission tool
• It selects and captures the mappings between the different types of files included in the submission.
• It transfers all the files using Aspera (default) or FTP.
• Command line alternative: some scripting is needed
http://www.proteomexchange.org/submission
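For the command line route, the file mappings have to be assembled by hand, so a sanity check before transfer can be useful. The sketch below represents the mappings as a dictionary from each result file to the spectra files it references and reports anything inconsistent. The data structure, messages and function name are hypothetical and do not reflect the submission tool's own file format.

```python
from typing import Dict, Iterable, List


def check_mappings(mappings: Dict[str, List[str]], submitted: Iterable[str]) -> List[str]:
    """Return human-readable problems found in a result-to-spectra mapping."""
    present = set(submitted)
    problems = []
    for result_file, spectra_files in sorted(mappings.items()):
        if result_file not in present:
            problems.append(f"{result_file}: result file not in submission")
        if not spectra_files:
            problems.append(f"{result_file}: no spectra files mapped")
        for name in spectra_files:
            if name not in present:
                problems.append(f"{result_file}: missing {name}")
    return problems


files = ["run01.mzid", "run01.mgf", "run02.mzid"]
mappings = {"run01.mzid": ["run01.mgf"], "run02.mzid": ["run02.mgf"]}
problems = check_mappings(mappings, files)
```

An empty list means every result file is present and all of its referenced spectra files are part of the submission.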
PX submission tool: screenshots
Step 3
Step 4
Fast file transfer with Aspera
- Aspera is the default file transfer protocol to PRIDE:
- PX Submission tool
- Command line
- Up to 50X faster than FTP
File transfer speed should not
be a problem!!
- Also now available for downloading files
Partial submissions can be used to store other data workflows
• Everything can be stored, not only MS/MS data (~90% of datasets):
very flexible mechanism to be able to capture all types of datasets
• PRIDE does not store SRM data (it goes to PASSEL)
• Top down proteomics datasets (10 datasets).
• Mass Spectrometry Imaging datasets (1 dataset).
• Data independent acquisition techniques: e.g. SWATH-MS (9
datasets), HDMSE (1 dataset).
From original publication [13]:
1. Thermo RAW data / UDP
2. Mirion Software (JLU)
Reconstructed ProteomeXchange data:
1. Thermo RAW data / UDP
2. Convert to imzML
3. Upload to PRIDE (EBI, Cambridge, UK)
4. Download from PRIDE
5. Display in MSiReader
- Vendor-independent data format
- Freely available software (open source)
- ‘open data‘ – free to reuse
- Anybody can do this!
Römpp et al., 2014, Anal Bioanal Chem, in press
Overview
• Introduction to PRIDE Archive (in the context of ProteomeXchange and the PSI standards)
• How to submit data to PRIDE: PRIDE tools
• ProteomeCentral, submission and access stats
ProteomeXchange data workflow: ProteomeCentral
[Workflow diagram. Labels: Researcher’s results; Raw data*; Results; Metadata / Manuscript; Receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data); ProteomeCentral; Journals; UniProt/neXtProt; Peptide Atlas; GPMDB; Other DBs; Reprocessed results]
ProteomeCentral: Portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
RSS feed for public datasets
ProteomeXchange: 1620 datasets up until 8th January 2015
Origin:
322 USA
197 Germany
148 United Kingdom
91 Netherlands
85 France
81 China
80 Switzerland
61 Canada
48 Belgium
47 Spain
45 Denmark
42 Australia
40 Japan
37 Sweden
28 Austria
22 India
21 Norway
21 Taiwan
20 Ireland
20 Finland
17 Italy
14 Brazil
13 Republic of Korea
13 Russia
10 Israel
9 Singapore …
Type:
526 PRIDE complete (32.5%)
982 PRIDE partial (60.6%)
63 PeptideAtlas/PASSEL complete
24 MassIVE
25 reprocessed
Publicly Accessible:
814 datasets, 50% of all
90% PRIDE
8% PASSEL
2% MassIVE
Data volume:
Total: ~71 TB
Number of all files: ~160,000
PXD000320-324: ~ 5 TB
PXD000065: ~ 1.4TB
Top Species studied by at least 10 datasets:
712 Homo sapiens
193 Mus musculus
65 Saccharomyces cerevisiae
61 Arabidopsis thaliana
35 Rattus norvegicus
34 Escherichia coli
17 Bos taurus
17 Glycine max
17 Mycobacterium tuberculosis
16 Drosophila melanogaster
14 Oryza sativa
~ 310 species in total
Datasets/year:
2012: 102
2013: 527
2014: 963
2015: 28
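The percentages and totals on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable and function names are arbitrary.

```python
TOTAL = 1620  # datasets up until 8th January 2015

BY_TYPE = {
    "PRIDE complete": 526,
    "PRIDE partial": 982,
    "PeptideAtlas/PASSEL complete": 63,
    "MassIVE": 24,
    "reprocessed": 25,
}
BY_YEAR = {2012: 102, 2013: 527, 2014: 963, 2015: 28}
PUBLIC = 814  # publicly accessible datasets


def share(count: int, total: int = TOTAL) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Both breakdowns add up to the stated total of 1620 datasets.
type_total = sum(BY_TYPE.values())
year_total = sum(BY_YEAR.values())

complete_share = share(BY_TYPE["PRIDE complete"])  # 32.5, as on the slide
partial_share = share(BY_TYPE["PRIDE partial"])    # 60.6, as on the slide
public_share = share(PUBLIC)                       # 50.2, quoted as 50%
```

Both the per-type and per-year breakdowns sum to 1620, so the figures on the slide are internally consistent.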
PRIDE: Submitted datasets per month
Access statistics: PRIDE File repository
2014: The rise of proteomics data re-use
Which are the most accessed datasets?
PeptideShaker facilitates reuse of PRIDE data
SearchGUI (http://searchgui.googlecode.com): Vaudel M, Barsnes H, Berven FS, Sickmann A, Martens L: Proteomics 2011;11(5):996-9.
PeptideShaker (http://peptide-shaker.googlecode.com): Vaudel M, Burkhart J, Zahedi RP, Berven FS, Sickmann A, Martens L, Barsnes H: Nature Biotechnology 2015;33(1):22-4.
Draft Human proteome papers published in 2014
Wilhelm et al., Nature, 2014
• Around 60% of the data used for the analysis comes from previous experiments, most of them stored in proteomics repositories such as PRIDE/ProteomeXchange, PASSEL or MassIVE.
• They complement that data with “exotic” tissues.
Conclusions
• Data submission and data reuse in the field are rising.
• PRIDE and ProteomeXchange enable this for you.
• Data standards are key for us.
• Quantification data depends on mzTab support.
Acknowledgements: People
Attila Csordas
Tobias Ternent
Noemi del Toro
Rui Wang
Florian Reisinger
Jose A. Dianes
Johannes Griss
Steven Lewis
Yasset Perez-Riverol
Henning Hermjakob
All ProteomeXchange partners,
especially Eric Deutsch and Nuno
Bandeira
Acknowledgements: The PRIDE Team and collaborators
Acknowledgements: Funding
pride-ebi@ebi.ac.uk
pride-support@ebi.ac.uk
http://www.proteomexchange.org
http://code.google.com/p/pride-converter-2/
@pride_ebi
Acknowledgements
Questions?

More Related Content

What's hot

The Materials Data Facility: A Distributed Model for the Materials Data Commu...
The Materials Data Facility: A Distributed Model for the Materials Data Commu...The Materials Data Facility: A Distributed Model for the Materials Data Commu...
The Materials Data Facility: A Distributed Model for the Materials Data Commu...Ben Blaiszik
 
2017 - METASPACE training guide
2017 - METASPACE training guide2017 - METASPACE training guide
2017 - METASPACE training guideMETASPACE
 
ARCHIVED: new version available. 2016 - METASPACE Training Course
ARCHIVED: new version available. 2016 - METASPACE Training CourseARCHIVED: new version available. 2016 - METASPACE Training Course
ARCHIVED: new version available. 2016 - METASPACE Training CourseMETASPACE
 
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseMaking Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseJustin Clark-Casey
 
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Bertram Ludäscher
 
Zelditchetal workbookgeomorphoanalyses
Zelditchetal workbookgeomorphoanalysesZelditchetal workbookgeomorphoanalyses
Zelditchetal workbookgeomorphoanalysesWagner M. S. Sampaio
 
DataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE
 
247th ACS Meeting: The Eureka Research Workbench
247th ACS Meeting: The Eureka Research Workbench247th ACS Meeting: The Eureka Research Workbench
247th ACS Meeting: The Eureka Research WorkbenchStuart Chalk
 
Acs collaborative computational technologies for biomedical research an enabl...
Acs collaborative computational technologies for biomedical research an enabl...Acs collaborative computational technologies for biomedical research an enabl...
Acs collaborative computational technologies for biomedical research an enabl...Sean Ekins
 

What's hot (13)

The Materials Data Facility: A Distributed Model for the Materials Data Commu...
The Materials Data Facility: A Distributed Model for the Materials Data Commu...The Materials Data Facility: A Distributed Model for the Materials Data Commu...
The Materials Data Facility: A Distributed Model for the Materials Data Commu...
 
Open innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectOpen innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts project
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
2017 - METASPACE training guide
2017 - METASPACE training guide2017 - METASPACE training guide
2017 - METASPACE training guide
 
ARCHIVED: new version available. 2016 - METASPACE Training Course
ARCHIVED: new version available. 2016 - METASPACE Training CourseARCHIVED: new version available. 2016 - METASPACE Training Course
ARCHIVED: new version available. 2016 - METASPACE Training Course
 
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data WarehouseMaking Linked Data SPARQL with the InterMine Biological Data Warehouse
Making Linked Data SPARQL with the InterMine Biological Data Warehouse
 
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
 
Zelditchetal workbookgeomorphoanalyses
Zelditchetal workbookgeomorphoanalysesZelditchetal workbookgeomorphoanalyses
Zelditchetal workbookgeomorphoanalyses
 
DataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management Planning
 
PEDSnet : 18 month summary on data integration and data quality
PEDSnet : 18 month summary on data integration and data qualityPEDSnet : 18 month summary on data integration and data quality
PEDSnet : 18 month summary on data integration and data quality
 
Pride cluster presentation
Pride cluster presentation Pride cluster presentation
Pride cluster presentation
 
247th ACS Meeting: The Eureka Research Workbench
247th ACS Meeting: The Eureka Research Workbench247th ACS Meeting: The Eureka Research Workbench
247th ACS Meeting: The Eureka Research Workbench
 
Acs collaborative computational technologies for biomedical research an enabl...
Acs collaborative computational technologies for biomedical research an enabl...Acs collaborative computational technologies for biomedical research an enabl...
Acs collaborative computational technologies for biomedical research an enabl...
 

Viewers also liked

MS Imaging data in ProteomeXchange (HUPO 2014)
MS Imaging data in ProteomeXchange (HUPO 2014)MS Imaging data in ProteomeXchange (HUPO 2014)
MS Imaging data in ProteomeXchange (HUPO 2014)Juan Antonio Vizcaino
 
Submitting your data to ProteomeXchange – a mini tutorial
Submitting your data to ProteomeXchange – a mini tutorialSubmitting your data to ProteomeXchange – a mini tutorial
Submitting your data to ProteomeXchange – a mini tutorialJuan Antonio Vizcaino
 
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...Juan Antonio Vizcaino
 
AHUPO_Vizcaino_remote_presentation_082014
AHUPO_Vizcaino_remote_presentation_082014AHUPO_Vizcaino_remote_presentation_082014
AHUPO_Vizcaino_remote_presentation_082014Juan Antonio Vizcaino
 
Dynamic linkage of public proteomics data in Ensembl using TrackHubs
Dynamic linkage of public proteomics data in Ensembl using TrackHubsDynamic linkage of public proteomics data in Ensembl using TrackHubs
Dynamic linkage of public proteomics data in Ensembl using TrackHubsJuan Antonio Vizcaino
 
ELIXIR Pilot Actions launched in 2014: Integration of BILS-ProteomeXchange us...
ELIXIR Pilot Actions launched in 2014: Integration of BILS-ProteomeXchange us...ELIXIR Pilot Actions launched in 2014: Integration of BILS-ProteomeXchange us...
ELIXIR Pilot Actions launched in 2014: Integration of BILS-ProteomeXchange us...Juan Antonio Vizcaino
 
ProteomeXchange: data deposition and data retrieval made easy
ProteomeXchange: data deposition and data retrieval made easyProteomeXchange: data deposition and data retrieval made easy
ProteomeXchange: data deposition and data retrieval made easyJuan Antonio Vizcaino
 
Mass Spectrometry Informatics formats in progress
Mass Spectrometry Informatics formats in progressMass Spectrometry Informatics formats in progress
Mass Spectrometry Informatics formats in progressJuan Antonio Vizcaino
 
PRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarPRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarJuan Antonio Vizcaino
 
Data Independent Analysis on Thermo Scientific Orbitrap MS Systems
Data Independent Analysis on Thermo Scientific Orbitrap MS SystemsData Independent Analysis on Thermo Scientific Orbitrap MS Systems
Data Independent Analysis on Thermo Scientific Orbitrap MS SystemsThermo Fisher Scientific
 
New DIA Workflows for Ultimate Flexibility in LCMS Proteomics
New DIA Workflows for Ultimate Flexibility in LCMS ProteomicsNew DIA Workflows for Ultimate Flexibility in LCMS Proteomics
New DIA Workflows for Ultimate Flexibility in LCMS Proteomicsthermo_omics
 
The mzTab data standard format for reporting MS-based peptide, protein and sm...
The mzTab data standard format for reporting MS-based peptide, protein and sm...The mzTab data standard format for reporting MS-based peptide, protein and sm...
The mzTab data standard format for reporting MS-based peptide, protein and sm...Juan Antonio Vizcaino
 

Viewers also liked (14)

MS Imaging data in ProteomeXchange (HUPO 2014)
MS Imaging data in ProteomeXchange (HUPO 2014)MS Imaging data in ProteomeXchange (HUPO 2014)
MS Imaging data in ProteomeXchange (HUPO 2014)
 
Submitting your data to ProteomeXchange – a mini tutorial
Submitting your data to ProteomeXchange – a mini tutorialSubmitting your data to ProteomeXchange – a mini tutorial
Submitting your data to ProteomeXchange – a mini tutorial
 
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
ProteomeXchange Experience: PXD Identifiers and Release of Data on Acceptance...
 
AHUPO_Vizcaino_remote_presentation_082014
AHUPO_Vizcaino_remote_presentation_082014AHUPO_Vizcaino_remote_presentation_082014
AHUPO_Vizcaino_remote_presentation_082014
 
Dynamic linkage of public proteomics data in Ensembl using TrackHubs
Dynamic linkage of public proteomics data in Ensembl using TrackHubsDynamic linkage of public proteomics data in Ensembl using TrackHubs
Dynamic linkage of public proteomics data in Ensembl using TrackHubs
 
ELIXIR Pilot Actions launched in 2014: Integration of BILS-ProteomeXchange us...
ELIXIR Pilot Actions launched in 2014: Integration of BILS-ProteomeXchange us...ELIXIR Pilot Actions launched in 2014: Integration of BILS-ProteomeXchange us...
ELIXIR Pilot Actions launched in 2014: Integration of BILS-ProteomeXchange us...
 
ProteomeXchange: data deposition and data retrieval made easy
ProteomeXchange: data deposition and data retrieval made easyProteomeXchange: data deposition and data retrieval made easy
ProteomeXchange: data deposition and data retrieval made easy
 
Mass Spectrometry Informatics formats in progress
Mass Spectrometry Informatics formats in progressMass Spectrometry Informatics formats in progress
Mass Spectrometry Informatics formats in progress
 
Euro lipids 2014_graz
Euro lipids 2014_grazEuro lipids 2014_graz
Euro lipids 2014_graz
 
PRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarPRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinar
 
Data Independent Analysis on Thermo Scientific Orbitrap MS Systems
Data Independent Analysis on Thermo Scientific Orbitrap MS SystemsData Independent Analysis on Thermo Scientific Orbitrap MS Systems
Data Independent Analysis on Thermo Scientific Orbitrap MS Systems
 
New DIA Workflows for Ultimate Flexibility in LCMS Proteomics
New DIA Workflows for Ultimate Flexibility in LCMS ProteomicsNew DIA Workflows for Ultimate Flexibility in LCMS Proteomics
New DIA Workflows for Ultimate Flexibility in LCMS Proteomics
 
The mzTab data standard format for reporting MS-based peptide, protein and sm...
The mzTab data standard format for reporting MS-based peptide, protein and sm...The mzTab data standard format for reporting MS-based peptide, protein and sm...
The mzTab data standard format for reporting MS-based peptide, protein and sm...
 
Human microbiome project
Human microbiome projectHuman microbiome project
Human microbiome project
 

Similar to Share and explore public proteomics datasets

An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...Juan Antonio Vizcaino
 
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
Share and explore public proteomics datasets

  • 1. PRIDE and ProteomeXchange: Share and explore public proteomics datasets like never before. Dr. Juan Antonio Vizcaíno, PRIDE Group Coordinator, Proteomics Services Team, EMBL-EBI, Hinxton, Cambridge, UK
  • 2. Overview (Juan A. Vizcaíno, juan@ebi.ac.uk; Midwinter Proteomics Bioinformatics Seminar, Semmering, 15 January 2015)
      • PRIDE Archive (in the context of ProteomeXchange and the PSI standards)
      • How to submit data to PRIDE: PRIDE tools
      • ProteomeCentral, submission and access stats
  • 3. Data sharing in Proteomics
      • Public availability of data in proteomics enables:
          • Reinterpretation (e.g. data reprocessing with different aims)
          • Improved analysis software
          • Changes in protein sequence databases (e.g. proteogenomics studies)
          • Consideration of new post-translational modifications
          • Validation of the experimental results reported
      • Specific use cases for proteomics: spectral libraries, fragmentation models, SRM transitions, …
  • 4. PRIDE (PRoteomics IDEntifications) database: http://www.ebi.ac.uk/pride
      • PRIDE stores mass spectrometry (MS)-based proteomics data:
          • Peptide and protein expression data (identification and quantification)
          • Post-translational modifications
          • Mass spectra (raw data and peak lists)
          • Technical and biological metadata
          • Any other related information
      • Full support for tandem MS approaches
      • References: Martens et al., Proteomics, 2005; Vizcaíno et al., NAR, 2013
  • 5. PRIDE Archive
      • New PRIDE DB archival system since 01/2014. Three iterations released so far. Still a work in progress.
      • Very flexible; its development has happened in parallel with:
          • Implementation of ProteomeXchange
          • New community PSI data standards: mzIdentML, mzQuantML and mzTab
  • 6. ProteomeXchange Consortium (http://www.proteomexchange.org; Vizcaíno et al., Nat Biotechnol, 2014)
      • Goal: Development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories.
      • Includes PeptideAtlas (ISB, Seattle), PRIDE (Cambridge, UK) and MassIVE (UCSD, San Diego).
      • Tranche and Peptidome were initially included but have been discontinued.
      • Common identifier space (PXD identifiers)
      • Two supported data workflows: MS/MS and SRM.
      • Main objective: Make life easier for researchers
  • 7. Current PSI Standard File Formats for MS
      • mzTab: final results (Griss et al., MCP, 2014)
      • TraML: SRM (Deutsch et al., MCP, 2012)
      • mzQuantML: quantitation (Walter et al., MCP, 2013)
      • mzIdentML: identification (Jones et al., MCP, 2012)
      • mzML: MS data (Martens et al., MCP, 2011)
  • 9. mzTab format: tab-delimited format (identification/quantification)
      • http://code.google.com/p/mztab/
      • References: J. Griss et al., MCP, 2014; Q.W. Xu et al., Proteomics, 2014
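Because mzTab is tab-delimited, it can be read with very little code. The following is a minimal sketch, assuming the line-prefix convention of mzTab 1.0 (`MTD` for metadata, `PRH`/`PRT` for the protein header and rows, `PSH`/`PSM` for the PSM header and rows, `COM` for comments). The sample fragment and the function name `read_mztab_sections` are hypothetical and only illustrate the idea; a real file carries many more columns and should be validated with dedicated tooling.

```python
import io
from collections import defaultdict

# Hypothetical, hand-written fragment for illustration only (not a real dataset).
SAMPLE_MZTAB = (
    "MTD\tmzTab-version\t1.0.0\n"
    "MTD\tdescription\tExample fragment\n"
    "PRH\taccession\tdescription\n"
    "PRT\tP12345\tExample protein\n"
    "PSH\tsequence\taccession\n"
    "PSM\tPEPTIDEK\tP12345\n"
)

# Each header prefix announces the column names of one row prefix.
HEADER_TO_ROW = {"PRH": "PRT", "PEH": "PEP", "PSH": "PSM", "SMH": "SML"}


def read_mztab_sections(handle):
    """Split a tab-delimited mzTab stream into metadata and per-section rows."""
    metadata = {}
    headers = {}
    rows = defaultdict(list)
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        prefix, values = fields[0], fields[1:]
        if prefix in ("", "COM"):
            continue  # skip blank lines and comments
        if prefix == "MTD" and len(values) >= 2:
            metadata[values[0]] = values[1]
        elif prefix in HEADER_TO_ROW:
            headers[HEADER_TO_ROW[prefix]] = values
        elif prefix in headers:
            # Pair each value with the column name from the matching header line.
            rows[prefix].append(dict(zip(headers[prefix], values)))
    return metadata, dict(rows)


metadata, rows = read_mztab_sections(io.StringIO(SAMPLE_MZTAB))
```

Keeping the header-to-row pairing in one small table means further sections can be supported without changing the loop.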
  • 10. Ways to access data in PRIDE Archive
      • PRIDE web interface
      • File repository
      • REST web service
      • PRIDE Inspector tool
  • 11. PRIDE Archive web interface
  • 12. PRIDE Archive web interface (2)
      • Next: visualization of spectra (in a couple of weeks)
  • 13. Programmatic access: PRIDE REST web service (http://www.ebi.ac.uk/pride/ws/archive/)
      • Intended to replace the most popular functionality provided by the PRIDE BioMart interface (now discontinued)
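As a rough illustration of programmatic access, the sketch below builds a request URL for one project and fetches it as JSON. Only the base URL comes from the slide. The `project/<accession>` path, the JSON response, and the helper names `project_url` and `fetch_project` are assumptions made for this example; the service layout may differ or may have changed since the talk, so check the current web service documentation before relying on it.

```python
import json
import re
import urllib.request

# Base URL as given on the slide.
BASE_URL = "http://www.ebi.ac.uk/pride/ws/archive/"

# PXD identifiers as used by ProteomeXchange, e.g. PXD000764.
PXD_PATTERN = re.compile(r"^PXD\d{6}$")


def project_url(accession, base_url=BASE_URL):
    """Build the URL for one project.

    ASSUMPTION: the 'project/<accession>' path is illustrative and is not
    stated in the slides.
    """
    if not PXD_PATTERN.match(accession):
        raise ValueError(f"Not a PXD accession: {accession!r}")
    return base_url.rstrip("/") + "/project/" + accession


def fetch_project(accession, timeout=30):
    """Fetch project metadata (assumes the service answers with JSON)."""
    with urllib.request.urlopen(project_url(accession), timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access; fetch_project() does.
url = project_url("PXD000764")
```

Separating URL construction from the network call keeps the first part easy to check offline.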
  • 14. Overview
      • Introduction to PRIDE Archive (in the context of ProteomeXchange and the PSI standards)
      • How to submit data to PRIDE: PRIDE tools
      • ProteomeCentral, submission and access stats
      • A sneak peek at data reuse
  • 15. ProteomeXchange data workflow: PRIDE (diagram labels)
      • Submitted by the researcher: Researcher’s results, Raw data*, Metadata
      • Receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data), Other DBs
      • ProteomeCentral: Metadata / Manuscript, Raw Data*, Results
      • Other resources: Journals, UniProt/neXtProt, PeptideAtlas, GPMDB, Other DBs (reprocessed results)
  • 16. Manuscript published detailing the process: Ternent et al., Proteomics, 2014
      • http://www.proteomexchange.org/submission
      • Example dataset: PXD000764
          • Title: “Discovery of new CSF biomarkers for meningitis in children”
          • 12 runs: 4 controls and 8 infected samples
          • Identification and quantification data
  • 17. PX Data workflow for MS/MS data
      1. Mass spectrometer output files: raw data (binary files) or peak list spectra in a standardized format (mzML, mzXML).
      2. Result files:
          a. Complete submissions: Result files can be converted to PRIDE XML or the mzIdentML data standard.
          b. Partial submissions: For workflows not yet supported by PRIDE, search engine output files will be stored and provided in their original form.
      3. Metadata: Sufficiently detailed description of sample origin, workflow, instrumentation, submitter.
      4. Other files (optional):
          a. QUANT: Quantification related results
          b. PEAK: Peak list files
          c. GEL: Gel images
          d. OTHER: Any other file type
          e. FASTA
          f. SP_LIBRARY
  • 18. Complete vs Partial submissions: experimental metadata (panels: Complete, Partial)
      • General experimental metadata about the projects is similar. However, at the assay level, the information in partial submissions is less detailed.
  • 19. Complete vs Partial submissions: processed results (panels: Complete, Partial)
      • For complete submissions, the spectra can be connected with the processed identification results and visualized.
  • 20. PX Data workflow for MS/MS data
      1. Mass spectrometer output files: raw data (binary files) or peak list spectra in a standardized format (mzML, mzXML).
      2. Result files:
          a. Complete submissions: Result files can be converted to PRIDE XML or the mzIdentML data standard.
          b. Partial submissions: For workflows not yet supported by PRIDE, search engine output files will be stored and provided in their original form.
      3. Metadata: Sufficiently detailed description of sample origin, workflow, instrumentation, submitter.
      4. Other files (optional; the list can be extended):
          a. QUANT: Quantification related results
          b. PEAK: Peak list files
          c. GEL: Gel images
          d. OTHER: Any other file type
          e. FASTA
          f. SP_LIBRARY
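The distinction between complete and partial submissions can be sketched as a small check over the file categories listed in the workflow. This is only an illustration: the optional type labels (QUANT, PEAK, GEL, OTHER, FASTA, SP_LIBRARY) come from the slides, while the labels `RAW`, `RESULT` and `SEARCH`, the `SubmissionFile` class and the `submission_kind` function are hypothetical names chosen for this example and are not the submission tool's own logic.

```python
from dataclasses import dataclass

# Optional file types as listed on the slide.
OPTIONAL_TYPES = {"QUANT", "PEAK", "GEL", "OTHER", "FASTA", "SP_LIBRARY"}
# ASSUMPTION: illustrative labels for raw data, converted result files
# (PRIDE XML or mzIdentML) and unconverted search engine output.
CORE_TYPES = {"RAW", "RESULT", "SEARCH"}


@dataclass(frozen=True)
class SubmissionFile:
    name: str
    file_type: str


def submission_kind(files):
    """Return 'complete' if a converted result file is present, else 'partial'."""
    types = {f.file_type for f in files}
    unknown = types - CORE_TYPES - OPTIONAL_TYPES
    if unknown:
        raise ValueError(f"Unknown file types: {sorted(unknown)}")
    if "RESULT" in types:
        return "complete"
    if "SEARCH" in types:
        return "partial"
    raise ValueError("No result or search engine output files found")


# Hypothetical file names, for illustration only.
example = [
    SubmissionFile("run1.raw", "RAW"),
    SubmissionFile("run1.mzid", "RESULT"),
    SubmissionFile("proteins.fasta", "FASTA"),
]
kind = submission_kind(example)
```

Rejecting unknown labels early keeps typos in a file list from being silently treated as optional files.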
  • 21. PRIDE Components: Submission Process (step 1). Diagram labels: PRIDE Converter 2, PRIDE Inspector, PX Submission Tool; result formats: mzIdentML, PRIDE XML
  • 22. ‘RESULT’ file generation. Before: only file conversion to PRIDE XML
      • Original data files: search output files, spectra files
      • File conversion: PRIDE Converter
      • Final ‘RESULT’ file: PRIDE XML
  • 23. ‘RESULT’ file generation. Now: native file export
      • Tools: Mascot, ProteinPilot, Scaffold, PEAKS, MSGF+, others
      • Native file export, together with the spectra files
      • Final ‘RESULT’ file: mzIdentML
  • 24. Complete submissions: search engine results + MS files
      • An increasing number of tools support export to mzIdentML 1.1:
          • Mascot
          • MSGF+
          • MyriMatch and related tools from D. Tabb’s lab
          • OpenMS
          • PEAKS
          • PeptideShaker
          • ProCon (ProteomeDiscoverer, Sequest)
          • Scaffold
          • TPP via the idConvert tool (ProteoWizard)
          • ProteinPilot (from version 5.0)
          • Crux
          • Others: library for X!Tandem conversion, lab internal pipelines, …
      • Referenced spectral files need to be submitted as well (all open formats are supported).
      • Updated list: http://www.psidev.info/tools-implementing-mzIdentML#
  • 25. ‘RESULT’ file generation. In the near future: native file export. Diagram labels: tools (Mascot, ProteinPilot, Scaffold, PEAKS, MSGF+, others); native file export; spectra files; final ‘RESULT’ file (mzTab).
  • 26. PRIDE Components: Submission Process (2). Diagram labels: PRIDE Converter 2, PRIDE Inspector, PX Submission Tool; mzIdentML, PRIDE XML.
  • 27. PRIDE Inspector 2 (Wang et al., Nat. Biotechnology, 2012)
    PRIDE Inspector 2 supports:
    - PRIDE XML
    - mzIdentML + all types of spectra files
    - mzML
    - mzTab Quantitation (work in progress)
    https://github.com/PRIDE-Toolsuite/
  • 28. PRIDE Inspector 2: New visualisation functionality for Protein Groups. https://github.com/PRIDE-Toolsuite/
  • 29. PRIDE Inspector 2: Private review of files submitted to PRIDE. https://github.com/PRIDE-Toolsuite/
  • 30. PRIDE Components: Submission Process (3). Diagram labels: PRIDE Converter 2, PRIDE Inspector, PX Submission Tool; mzIdentML, PRIDE XML.
  • 31. PX submission tool
    • It selects and captures the mappings between the different types of files included in the submission.
    • It transfers all the files using Aspera (default) or FTP.
    • Command line alternative: some scripting is needed.
    http://www.proteomexchange.org/submission
    (Diagram labels: Published, Raw, Other files)
  • 32. PX submission tool: screenshots (Step 3, Step 4)
  • 33. Fast file transfer with Aspera
    - Aspera is the default file transfer protocol to PRIDE:
      - PX Submission tool
      - Command line
    - Up to 50X faster than FTP
    - Also now available for downloading files
    File transfer speed should not be a problem!
  • 34. Partial submissions can be used to store other data workflows
    • Everything can be stored, not only MS/MS data (~90% of datasets): a very flexible mechanism that can capture all types of datasets.
    • PRIDE does not store SRM data (it goes to PASSEL).
    • Top down proteomics datasets (10 datasets).
    • Mass Spectrometry Imaging datasets (1 dataset).
    • Data independent acquisition techniques: e.g. SWATH-MS (9 datasets), HDMSE (1 dataset).
  • 35. Mass spectrometry imaging example (Römpp et al., 2014, Anal Bioanal Chem, in press)
    Panel C, from original publication [13]:
      1. Thermo RAW data / UDP
      2. Mirion Software (JLU)
    Panel D, reconstructed ProteomeXchange data:
      1. Thermo RAW data / UDP
      2. Convert to imzML
      3. Upload to PRIDE (EBI, Cambridge, UK)
      4. Download from PRIDE
      5. Display in MSiReader
    - Vendor-independent data format
    - Freely available software (open source)
    - ‘open data’: free to reuse
    - Anybody can do this!
  • 36. Overview
    • Introduction to PRIDE Archive (in the context of ProteomeXchange and the PSI standards)
    • How to submit data to PRIDE: PRIDE tools
    • ProteomeCentral, submission and access stats
  • 37. ProteomeXchange data workflow: ProteomeCentral
    (Diagram labels. Submitted by the researcher: Researcher’s results, Raw data*, Metadata / Manuscript. Receiving repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), PASSEL (SRM data), other DBs. ProteomeCentral: Metadata. Consumers: Journals, UniProt/neXtProt, Peptide Atlas, GPMDB, other DBs; reprocessed results.)
  • 38. ProteomeCentral: Portal for all PX datasets http://proteomecentral.proteomexchange.org/cgi/GetDataset
  • 39. ProteomeCentral: Portal for all PX datasets http://proteomecentral.proteomexchange.org/cgi/GetDataset
  • 40. RSS feed for public datasets
  • 41. ProteomeXchange: 1620 datasets up until 8th January 2015
    Type:
      526 PRIDE complete (32.5%)
      982 PRIDE partial (60.6%)
      63 PeptideAtlas/PASSEL complete
      24 MassIVE
      25 reprocessed
    Publicly accessible: 814 datasets, 50% of all (90% PRIDE, 8% PASSEL, 2% MassIVE)
    Data volume:
      Total: ~71 TB
      Number of all files: ~160,000
      PXD000320-324: ~5 TB
      PXD000065: ~1.4 TB
    Datasets/year: 2012: 102; 2013: 527; 2014: 963; 2015: 28
    Origin: 322 USA, 197 Germany, 148 United Kingdom, 91 Netherlands, 85 France, 81 China, 80 Switzerland, 61 Canada, 48 Belgium, 47 Spain, 45 Denmark, 42 Australia, 40 Japan, 37 Sweden, 28 Austria, 22 India, 21 Norway, 21 Taiwan, 20 Ireland, 20 Finland, 17 Italy, 14 Brazil, 13 Republic of Korea, 13 Russia, 10 Israel, 9 Singapore, …
    Top species studied by at least 10 datasets (~310 species in total): 712 Homo sapiens, 193 Mus musculus, 65 Saccharomyces cerevisiae, 61 Arabidopsis thaliana, 35 Rattus norvegicus, 34 Escherichia coli, 17 Bos taurus, 17 Glycine max, 17 Mycobacterium tuberculosis, 16 Drosophila melanogaster, 14 Oryza sativa
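The totals and percentages on slide 41 can be re-derived from the counts given there. The short Python check below uses only numbers from the slide; the variable and function names are illustrative.

```python
TOTAL_DATASETS = 1620

# Counts by submission type, as listed on the slide.
by_type = {
    "PRIDE complete": 526,
    "PRIDE partial": 982,
    "PeptideAtlas/PASSEL complete": 63,
    "MassIVE": 24,
    "reprocessed": 25,
}

# Counts by year, as listed on the slide.
by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 28}

PUBLIC_DATASETS = 814


def percentage(part, whole):
    """Share of `part` in `whole`, in percent, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Both breakdowns add up to the stated total of 1620 datasets.
assert sum(by_type.values()) == TOTAL_DATASETS
assert sum(by_year.values()) == TOTAL_DATASETS

print(percentage(by_type["PRIDE complete"], TOTAL_DATASETS))  # 32.5
print(percentage(by_type["PRIDE partial"], TOTAL_DATASETS))   # 60.6
print(percentage(PUBLIC_DATASETS, TOTAL_DATASETS))            # 50.2
```

The public share comes out at 50.2%, which the slide rounds to 50%.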
  • 42. PRIDE: Submitted datasets per month
  • 43. Access statistics: PRIDE File repository. 2014: The rise of proteomics data re-use
  • 44. Which are the most accessed datasets?
  • 45. PeptideShaker facilitates reuse of PRIDE data
    Vaudel M, Barsnes H, Berven FS, Sickmann A, Martens L: Proteomics 2011;11(5):996-9. http://searchgui.googlecode.com
    Vaudel M, Burkhart J, Zahedi RP, Berven FS, Sickmann A, Martens L, Barsnes H: Nature Biotechnology 2015; 33(1):22-4. http://peptide-shaker.googlecode.com
  • 46. Draft human proteome papers published in 2014 (Wilhelm et al., Nature, 2014)
    • Around 60% of the data used for the analysis comes from previous experiments, most of it stored in proteomics repositories such as PRIDE/ProteomeXchange, PASSEL or MassIVE.
    • They complement that data with “exotic” tissues.
  • 47. Conclusions
    • Data submission and data reuse in the field are rising.
    • PRIDE and ProteomeXchange enable this for you.
    • Data standards are key for us.
    • Quantification data depends on mzTab support.
  • 48. Acknowledgements: The PRIDE Team and collaborators
    People: Attila Csordas, Tobias Ternent, Noemi del Toro, Rui Wang, Florian Reisinger, Jose A. Dianes, Johannes Griss, Steven Lewis, Yasset Perez-Riverol, Henning Hermjakob
    All ProteomeXchange partners, especially Eric Deutsch and Nuno Bandeira
  • 49. Acknowledgements: Funding
    pride-ebi@ebi.ac.uk
    pride-support@ebi.ac.uk
    http://www.proteomexchange.org
    http://code.google.com/p/pride-converter-2/
    @pride_ebi
  • 50. Questions?