The ProteomeXchange Consortium: 2016 update
Dr. Juan Antonio Vizcaíno
Proteomics Team Leader
EMBL-European Bioinformatics Institute
Hinxton, Cambridge, UK
Juan A. Vizcaíno
juan@ebi.ac.uk
HUPO 2016 World Conference
Taipei, 20 September 2016
PSI Spring Meeting 2017
Beijing Proteome Research Center, China
April 24-26, 2017
April 23: 2nd PHOENIX Mini-Symposium on Frontiers of Proteomics
April 27: Hiking the Great Wall
Focus topics:
• Quality control: qcML
• Proteogenomics formats
• proXI: proteomics eXpression Interface
• Privacy and Proteomics Data
Overview
• General introduction to ProteomeXchange
• Overall submission statistics
• Updated HPP guidelines
• Specifics about MassIVE (Nuno)
ProteomeXchange: A global, distributed proteomics database
• PASSEL (SRM data)
• PRIDE (MS/MS data)
• MassIVE (MS/MS data)
• jPOST (MS/MS data), new in 2016
• Data types: raw data (Raw), identification/quantification results (ID/Q), metadata (Meta)
• Mandatory raw data deposition since July 2015
• Goal: Development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories.
http://www.proteomexchange.org
ProteomeXchange data workflow
[Diagram; labels only]
• Research groups: DATASETS (researcher's results, raw data, metadata)
• Receiving repositories: PRIDE, MassIVE, jPOST, PASSEL
• Submission types: MS/MS data (as complete submissions); any other workflow (mainly partial submissions); SRM data
• ProteomeCentral
• Data flows shown: metadata / manuscript, raw data, results
• Journals
• Reanalysis of datasets (reprocessed results): PeptideAtlas, MassIVE
ProteomeCentral: Centralised portal for all PX datasets
http://proteomecentral.proteomexchange.org/cgi/GetDataset
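As a rough illustration of how a single dataset page could be addressed through this portal, the sketch below builds a GetDataset link from a PX accession. The `ID` and `outputMode` query parameters and the `PXD` plus six digits accession pattern are assumptions that do not appear on the slide, and `dataset_url` is a hypothetical helper name; check the ProteomeCentral documentation before relying on them.

```python
import re
from urllib.parse import urlencode

# Base URL taken from the slide.
GET_DATASET_URL = "http://proteomecentral.proteomexchange.org/cgi/GetDataset"

# Assumption: PX accessions look like "PXD" followed by six digits.
_PXD_PATTERN = re.compile(r"PXD\d{6}")


def dataset_url(accession: str, output_mode: str = "") -> str:
    """Build a ProteomeCentral link for one dataset (hypothetical helper).

    The "ID" and "outputMode" parameter names are assumptions, not
    something stated in the talk.
    """
    if not _PXD_PATTERN.fullmatch(accession):
        raise ValueError(f"not a PXD accession: {accession!r}")
    params = {"ID": accession}
    if output_mode:
        params["outputMode"] = output_mode
    return f"{GET_DATASET_URL}?{urlencode(params)}"


print(dataset_url("PXD000001"))
print(dataset_url("PXD000001", output_mode="XML"))
```

The function only builds the link; it does not contact the server, so it can be used to prepare lists of URLs for later download.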
ProteomeXchange data workflow
[Diagram; labels only]
• Research groups: DATASETS (researcher's results, raw data, metadata)
• Receiving repositories: PRIDE, MassIVE, jPOST, PASSEL
• Submission types: MS/MS data (as complete submissions); any other workflow (mainly partial submissions); SRM data
• ProteomeCentral
• Data flows shown: metadata / manuscript, raw data, results
• Journals
• Reanalysis of datasets (reprocessed results): PeptideAtlas, MassIVE, GPMDB, proteomicsDB
• UniProt/neXtProt, other DBs
• OmicsDI: integration with other omics datasets
OmicsDI: Portal for omics datasets
http://www.ebi.ac.uk/Tools/omicsdi/
• Aims to integrate ‘omics’ datasets (at present: proteomics, transcriptomics, metabolomics and genomics).
• Resources shown: PRIDE, MassIVE, jPOST, PASSEL, GPMDB, ArrayExpress, Expression Atlas, MetaboLights, Metabolomics Workbench, GNPS, EGA
Perez-Riverol et al., 2016, bioRxiv
ProteomeXchange: 4,534 datasets up until 31 July 2016
Type (receiving repository):
• PRIDE: 4067
• MassIVE: 339
• PeptideAtlas/PASSEL: 115
• jPOST: 13
Publicly accessible: 2597 datasets (57% of all)
• PRIDE: 2334
• MassIVE: 135
• PASSEL: 115
• jPOST: 13
Datasets per year:
• 2012: 102
• 2013: 527
• 2014: 963
• 2015: 1758
• 2016 (until end of July): 1184
Countries with at least 100 datasets:
• USA: 1105
• Germany: 546
• United Kingdom: 411
• China: 356
• France: 229
• Netherlands: 188
• Canada: 178
• Switzerland: 150
• Australia: 125
• Spain: 123
• Denmark: 123
• Japan: 117
• Sweden: 101
Top species studied (at least 100 datasets each):
• Homo sapiens: 2010
• Mus musculus: 604
• Saccharomyces cerevisiae: 191
• Arabidopsis thaliana: 140
• Rattus norvegicus: 127
936 reported taxa in total
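The per-repository and per-year counts on this slide can be cross-checked against the stated totals. The short sketch below does only that arithmetic; all numbers are copied from the slide, while the variable names are illustrative.

```python
# Counts copied from the slide (datasets up until 31 July 2016).
by_repository = {"PRIDE": 4067, "MassIVE": 339, "PeptideAtlas/PASSEL": 115, "jPOST": 13}
public_by_repository = {"PRIDE": 2334, "MassIVE": 135, "PASSEL": 115, "jPOST": 13}
by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 1758, 2016: 1184}

total = sum(by_repository.values())
public_total = sum(public_by_repository.values())
public_share = round(100 * public_total / total)  # percentage, rounded

print(f"total datasets: {total}")
print(f"sum over years: {sum(by_year.values())}")
print(f"public datasets: {public_total} ({public_share}% of all)")
```

Both breakdowns add up to the 4,534 datasets in the slide title, and the public share rounds to the 57% quoted.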
Datasets are being reused more and more
Data download volume for PRIDE in 2015: ~ 200 TB
Vaudel et al., Proteomics, 2016
HPP guidelines version 2.1
Complete vs Partial submissions: processed results
[Panels: Complete, Partial]
For complete submissions, the spectra can be linked to the processed identification results and visualized.
Complete vs Partial submissions: experimental metadata
[Panels: Complete, Partial]
General experimental metadata about the projects is similar. However, at the assay level, the information in partial submissions is less detailed.
iProX: an observer of the ProteomeXchange Consortium
• Proteome data sharing platform in China
• Focus:
  • Collection and sharing of raw data from proteome experiments
  • Standardized metadata for proteome experiments
  • Visualization of proteome datasets
• Provides:
  • A user-friendly data submission pipeline
  • Structured management of datasets
  • An effective user authority system
  • Standardized metadata collection
  • Powerful computing, storage, and network resources to support the pipeline
  • Remote data backup and synchronous update
www.iprox.org
MassIVE update
Mingxun Wang (1,2,4), Jeremy Carver (1,4), Nuno Bandeira (1-4)
(1) Center for Computational Mass Spectrometry
(2) Computer Science and Engineering
(3) Skaggs School of Pharmacy and Pharmaceutical Sciences
(4) University of California, San Diego
http://massive.ucsd.edu
http://massive.ucsd.edu http://proteomics.ucsd.edu
MassIVE Interactivity
• MassIVE = Mass spectrometry Interactive Virtual Environment
MassIVE reanalysis
• Community knowledge requires reproducible, well-characterized results
• MS-GF+ standard database search
• Reanalyzed 15 TB of human data with ~185M MS/MS spectra
• 79 million new FDR-controlled PSMs
• 3.6 million modified versions of 2.8 million unique peptide sequences
• CPTAC colon cancer dataset available with 5 different result sets:
  • [Original] Imported CPTAC results: 6.9M PSMs
  • [Reanalysis] MS-GF+ database search: 8.9M PSMs, 70k mod variants (169k total)
  • [Reanalysis] Spectral library search (MSPLIT): 10M PSMs, including 387k mixture spectra
  • [Reanalysis] Proteogenomics searches of TCGA transcriptomics sequences (Enosi): 6.8M total PSMs, 19,728 proteogenomic events
  • [Reanalysis] Blind modification search (MODa): 7.8M PSMs, 2.8M PSMs for 221k mod variants (306k total), 203k new mod variants (unique modified peptides)
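The slide does not say how the PSMs were FDR-controlled. One widely used approach is target-decoy estimation, and the sketch below illustrates that general idea only; it is not a description of the MassIVE pipeline. The `Psm` record, the "higher score is better" convention and the simple decoys/targets estimate are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Psm:
    """Hypothetical minimal PSM record for this illustration."""
    spectrum_id: str
    peptide: str
    score: float    # assumption: higher score means a better match
    is_decoy: bool  # True if the peptide comes from the decoy database


def filter_by_fdr(psms: List[Psm], max_fdr: float = 0.01) -> List[Psm]:
    """Keep target PSMs whose target-decoy q-value is <= max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)

    # Running FDR estimate (decoys / targets) at each score threshold.
    fdrs = []
    targets = decoys = 0
    for psm in ranked:
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / targets if targets else 1.0)

    # q-value: the smallest FDR at this rank or any lower-scoring rank.
    qvalues = fdrs[:]
    for i in range(len(qvalues) - 2, -1, -1):
        qvalues[i] = min(qvalues[i], qvalues[i + 1])

    return [p for p, q in zip(ranked, qvalues) if not p.is_decoy and q <= max_fdr]


# Made-up toy input, not real data.
toy = [
    Psm("s1", "PEPTIDEK", 10.0, False),
    Psm("s2", "SAMPLER", 9.0, False),
    Psm("s3", "ANOTHERK", 8.0, False),
    Psm("s4", "KEDITPEP", 7.0, True),
    Psm("s5", "LASTONER", 6.0, False),
    Psm("s6", "RELPMAS", 5.0, True),
]
strict = filter_by_fdr(toy, max_fdr=0.01)
relaxed = filter_by_fdr(toy, max_fdr=0.25)
print([p.spectrum_id for p in strict])
print([p.spectrum_id for p in relaxed])
```

Real search workflows typically apply such thresholds at several levels (PSM, peptide, protein) and may use different estimators, so this should be read as a teaching sketch.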
MassIVE: Do it yourself
1. MS-GF+: database search engine
2. MSPLIT: spectral library search engine
3. Enosi: proteogenomic search engine
4. MODa: multi-blind modification database search engine
5. Spectral Networks: spectral alignment-based analysis and propagation of identifications
6. Multi-pass: MSPLIT, MSGFDB, MODa cascade search workflow
7. MSGFDB: database search engine
8. MSPLIT-DIA: spectral library search for SWATH
9. Upload your own! (mzIdentML, mzTab, TSV)
Check what others think the spectrum is: MassIVE Search
• Find peptides, proteins, PTMs
• Agreement in spectrum identification?
• One-stop search across tens of millions of PSMs
  • Original
  • Reanalysis
What can you do?
• How can the community work together to reveal the whole human proteome?
• Mass spectrometrists → share Data
  • At least: partial submissions with raw mass spectrometry data and enough metadata to allow for reanalysis
  • Especially useful: rare tissues/conditions or very deep acquisition
• Biologists → share Knowledge
  • At least: complete submissions with FDR-filtered results in open format (mzIdentML or mzTab)
  • Especially useful: human-curated knowledge of proteins, PTMs, endogenous peptides, etc.
• Bioinformaticians → share Reanalyses
  • At least: FDR-filtered results in open format (mzIdentML or mzTab)
  • Especially useful: algorithms that identify new types of PSMs (e.g., PTM-specific, mixtures)
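To give a feel for what results in an open format look like in practice, here is a minimal sketch that reads the PSM section of an mzTab file, a tab-separated text format in which the `PSH` line carries the PSM column names and each `PSM` line carries one match. The function name `iter_mztab_psms` and the example rows (sequences, accessions, scores) are made up for illustration; a production reader should use a maintained mzTab library and handle the full specification.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def iter_mztab_psms(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per PSM row of an mzTab file (hypothetical helper)."""
    header = None
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank separator lines
        if row[0] == "PSH":  # header line of the PSM section
            header = row[1:]
        elif row[0] == "PSM" and header is not None:
            yield dict(zip(header, row[1:]))


# Tiny made-up example (not real data), built with explicit tabs.
EXAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ("MTD", "mzTab-version", "1.0.0"),
        ("PSH", "sequence", "PSM_ID", "accession", "search_engine_score[1]"),
        ("PSM", "PEPTIDEK", "1", "P12345", "0.001"),
        ("PSM", "SAMPLER", "2", "P67890", "0.004"),
    ]
)

psms = list(iter_mztab_psms(io.StringIO(EXAMPLE)))
for psm in psms:
    print(psm["PSM_ID"], psm["sequence"], psm["accession"])
```

Because the rows come back as plain dictionaries keyed by column name, they can be passed on to whatever filtering or comparison step a reanalysis needs.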
Acknowledgements: People
Attila Csordas
Tobias Ternent
Gerhard Mayer (de.NBI)
Yasset Perez-Riverol
Manuel Bernal-Llinares
Andrew Jarnuczak
Former team members, especially:
Rui Wang
Florian Reisinger
Noemi del Toro
Jose A. Dianes
Henning Hermjakob
Acknowledgements: The PRIDE Team and all PX partners
All data submitters!
Eric Deutsch
Zhi Sun
David Campbell
Nuno Bandeira
Mingxun Wang
Jeremy Carver
Yasushi Ishihama
Shujiro Okuda
Shin Kawano
Follow new datasets @proteomexchange

HOW DO ORGANISMS REPRODUCE?reproduction part 1
Shashank Shekhar Pandey
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
Abdul Wali Khan University Mardan,kP,Pakistan
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 

Recently uploaded (20)

在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
 
Katherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdfKatherine Romanak - Geologic CO2 Storage.pdf
Katherine Romanak - Geologic CO2 Storage.pdf
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
 
HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 

ProteomeXchange update HUPO 2016

  • 1. The ProteomeXchange Consortium: 2016 update Dr. Juan Antonio Vizcaíno Proteomics Team Leader EMBL-European Bioinformatics Institute Hinxton, Cambridge, UK
  • 2. Juan A. Vizcaíno juan@ebi.ac.uk HUPO 2016 World Conference Taipei, 20 September 2016 PSI Spring Meeting 2017 Beijing Proteome Research Center, China April 24-26, 2017 April 23: 2nd PHOENIX Mini-Symposium on Frontiers of Proteomics April 27: Hiking the Great Wall Focus topics: • Quality control: qcML • Proteogenomics formats • proXI: proteomics eXpression Interface • Privacy and Proteomics Data
  • 3. Overview • General introduction to ProteomeXchange • Overall submission statistics • Updated HPP guidelines • Specifics about MassIVE (Nuno)
  • 4. ProteomeXchange: A global, distributed proteomics database. PASSEL (SRM data); PRIDE (MS/MS data); MassIVE (MS/MS data). Raw, ID/Q, Meta. Mandatory raw data deposition since July 2015. • Goal: Development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories. http://www.proteomexchange.org
  • 5. ProteomeXchange: A global, distributed proteomics database. PASSEL (SRM data); PRIDE (MS/MS data); MassIVE (MS/MS data); jPOST (MS/MS data, new in 2016). Raw, ID/Q, Meta. Mandatory raw data deposition since July 2015. • Goal: Development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories. http://www.proteomexchange.org
  • 6. ProteomeXchange data workflow (diagram labels): ProteomeCentral; Metadata / Manuscript; Raw Data; Results; Journals; PeptideAtlas; Receiving repositories; PRIDE; Researcher’s results; Raw data; Metadata; PASSEL; Research groups; Reanalysis of datasets; MassIVE; jPOST; MS/MS data (as complete submissions); Any other workflow (mainly partial submissions); DATASETS; SRM data; Reprocessed results; MassIVE
  • 7. ProteomeCentral: Centralised portal for all PX datasets http://proteomecentral.proteomexchange.org/cgi/GetDataset
  • 8. ProteomeXchange data workflow (diagram labels): ProteomeCentral; Metadata / Manuscript; Raw Data; Results; Journals; PeptideAtlas; Receiving repositories; PRIDE; Researcher’s results; Raw data; Metadata; PASSEL; Research groups; Reanalysis of datasets; MassIVE; jPOST; MS/MS data (as complete submissions); Any other workflow (mainly partial submissions); DATASETS; SRM data; Reprocessed results; MassIVE
  • 9. ProteomeXchange data workflow (diagram labels): ProteomeCentral; Metadata / Manuscript; Raw Data; Results; Journals; UniProt/neXtProt; PeptideAtlas; Other DBs; Receiving repositories; PRIDE; GPMDB; Researcher’s results; Raw data; Metadata; PASSEL; proteomicsDB; Research groups; Reanalysis of datasets; MassIVE; jPOST; MS/MS data (as complete submissions); Any other workflow (mainly partial submissions); DATASETS; OmicsDI; Integration with other omics datasets; SRM data; Reprocessed results; MassIVE
  • 10. OmicsDI: Portal for omics datasets http://www.ebi.ac.uk/Tools/omicsdi/ • Aims to integrate ‘omics’ datasets (proteomics, transcriptomics, metabolomics and genomics at present). PRIDE; MassIVE; jPOST; PASSEL; GPMDB; ArrayExpress; Expression Atlas; MetaboLights; Metabolomics Workbench; GNPS; EGA. Perez-Riverol et al., 2016, bioRxiv
  • 11. OmicsDI: Portal for omics datasets. Perez-Riverol et al., 2016, bioRxiv
  • 12. Overview • General introduction to ProteomeXchange • Overall submission statistics • Updated HPP guidelines • Specifics about MassIVE (Nuno)
  • 13. ProteomeXchange: 4,534 datasets up until 31st July, 2016. Type: 4067 PRIDE; 339 MassIVE; 115 PeptideAtlas/PASSEL; 13 jPOST. Publicly accessible: 2597 datasets, 57% of all (2334 PRIDE; 135 MassIVE; 115 PASSEL; 13 jPOST). Datasets/year: 2012: 102; 2013: 527; 2014: 963; 2015: 1758; 2016 (till end of July): 1184. Countries with at least 100 datasets: 1105 USA; 546 Germany; 411 United Kingdom; 356 China; 229 France; 188 Netherlands; 178 Canada; 150 Switzerland; 125 Australia; 123 Spain; 123 Denmark; 117 Japan; 101 Sweden. Top species studied by at least 100 datasets: 2010 Homo sapiens; 604 Mus musculus; 191 Saccharomyces cerevisiae; 140 Arabidopsis thaliana; 127 Rattus norvegicus. 936 reported taxa in total.
  • 14. Datasets are being reused more and more… Data download volume for PRIDE in 2015: ~200 TB. Vaudel et al., Proteomics, 2016
  • 15. Overview • General introduction to ProteomeXchange • Overall submission statistics • Updated HPP guidelines • Specifics about MassIVE (Nuno)
  • 16. HPP guidelines version 2.1
  • 17. Complete vs Partial submissions: processed results (panels: Complete, Partial). For complete submissions, the spectra can be connected with the processed identification results, and both can be visualized.
  • 18. Complete vs Partial submissions: experimental metadata (panels: Complete, Partial). General experimental metadata about the projects is similar. However, at the assay level, the information in partial submissions is less detailed.
  • 19. An observer of the ProteomeXchange Consortium: iProX • Proteome data sharing platform in China • Focusing on: collection and sharing of proteome experiment raw data; standardized metadata of proteome experiments; visualization of proteome datasets • Providing: a user-friendly data submission pipeline; structured management of datasets; an effective user authority system; standardized metadata collection; powerful computing, storage, and network resources to support the pipeline; remote data backup and synchronous update. www.iprox.org
  • 20. Overview • General introduction to ProteomeXchange • Overall submission statistics • Updated HPP guidelines • Specifics about MassIVE (Nuno)
  • 21. MassIVE update. Mingxun Wang (1,2,4), Jeremy Carver (1,4), Nuno Bandeira (1-4). (1) Center for Computational Mass Spectrometry; (2) Computer Science and Engineering; (3) Skaggs School of Pharmacy and Pharmaceutical Sciences; (4) University of California, San Diego. http://massive.ucsd.edu
  • 22. MassIVE Interactivity • MassIVE = Mass spectrometry Interactive Virtual Environment. http://massive.ucsd.edu http://proteomics.ucsd.edu
  • 23. Massive reanalysis • Community knowledge requires reproducible, well-characterized results • MS-GF+ standard database search • Reanalyzed 15 TB of Human data with ~185M MS/MS spectra • 79 million new FDR-controlled PSMs • 3.6 million modified versions of 2.8 million unique peptide sequences • CPTAC colon cancer available with 5 different results sets • [Original] Imported CPTAC results: 6.9M PSMs • [Reanalysis] MS-GF+ database search: 8.9M PSMs, 70k mod variants (169k total) • [Reanalysis] Spectral library search (MSPLIT): 10M PSMs, including 387K mixture spectra • [Reanalysis] Proteogenomics searches of TCGA transcriptomics sequences (Enosi): 6.8M total PSMs, 19,728 proteogenomic events • [Reanalysis] Blind modification search (MODa): 7.8M PSMs, 2.8M PSMs for 221k mod variants (306k total), 203K new mod variants (unique modified peptides). http://massive.ucsd.edu http://proteomics.ucsd.edu
  • 24. Massive: Do it yourself 1. MS-GF+ – Database search engine 2. MSPLIT – Spectral library search engine 3. ENOSI – Proteogenomic search engine 4. MODa – Multi-blind modification database search engine 5. Spectral Networks – Spectral alignment-based analysis and propagation of identifications 6. Multi-pass – MSPLIT, MSGFDB, MODa cascade search workflow 7. MSGFDB – Database search engine 8. MSPLIT-DIA – Spectral library search for SWATH 9. Upload your own! (mzIdentML, mzTab, TSV) http://massive.ucsd.edu http://proteomics.ucsd.edu
  • 25. Check what others think the spectrum is – Massive Search • Find peptides, proteins, PTMs • Agreement in spectrum identification? One-stop search across tens of millions of PSMs • Original • Reanalysis. http://massive.ucsd.edu http://proteomics.ucsd.edu
  • 26. What can you do? • How can the community work together to reveal the whole human proteome? • Mass spectrometrists → share Data • At least: partial submissions with raw mass spectrometry data and enough metadata to allow for reanalysis • Especially useful: rare tissues/conditions or very deep acquisition • Biologists → share Knowledge • At least: complete submissions with FDR-filtered results in open format (mzIdentML or mzTab) • Especially useful: human-curated knowledge of proteins, PTMs, endogenous peptides, etc. • Bioinformaticians → share Reanalyses • At least: FDR-filtered results in open format (mzIdentML or mzTab) • Especially useful: algorithms that identify new types of PSMs (e.g., PTM-specific, mixtures). http://massive.ucsd.edu http://proteomics.ucsd.edu
  • 27. Acknowledgements: People. Attila Csordas, Tobias Ternent, Gerhard Mayer (de.NBI), Yasset Perez-Riverol, Manuel Bernal-Llinares, Andrew Jarnuczak. Former team members, especially: Rui Wang, Florian Reisinger, Noemi del Toro, Jose A. Dianes. Henning Hermjakob. Acknowledgements: The PRIDE Team and all PX partners. All data submitters! Eric Deutsch, Zhi Sun, David Campbell; Nuno Bandeira, Mingxun Wang, Jeremy Carver; Yasushi Ishihama, Shujiro Okuda, Shin Kawano. Follow new datasets @proteomexchange
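
The submission statistics on slide 13 can be cross-checked with a few lines of arithmetic. The sketch below is not part of the talk: it only re-enters the figures quoted on that slide, and the variable and function names are illustrative assumptions. It checks that the per-repository and per-year counts add up to the stated totals and recomputes the quoted 57% public share.

```python
# Figures transcribed from slide 13 (datasets up until 31 July 2016).
# All names below are illustrative; only the numbers come from the slide.
TOTAL_DATASETS = 4534
PUBLIC_DATASETS = 2597

by_repository = {"PRIDE": 4067, "MassIVE": 339, "PeptideAtlas/PASSEL": 115, "jPOST": 13}
public_by_repository = {"PRIDE": 2334, "MassIVE": 135, "PASSEL": 115, "jPOST": 13}
by_year = {2012: 102, 2013: 527, 2014: 963, 2015: 1758, 2016: 1184}


def public_share(public: int, total: int) -> float:
    """Return the percentage of datasets that are publicly accessible."""
    return 100 * public / total


# Each breakdown should sum to the total it is a breakdown of.
repository_total = sum(by_repository.values())
public_total = sum(public_by_repository.values())
yearly_total = sum(by_year.values())

print(f"Sum over repositories: {repository_total}")
print(f"Sum over public datasets: {public_total}")
print(f"Sum over years: {yearly_total}")
print(f"Public share: {public_share(PUBLIC_DATASETS, TOTAL_DATASETS):.1f}%")
```

With the slide's numbers, all three breakdowns agree with the stated totals, and the public share rounds to the 57% shown on the slide.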