Exploring the potential of public
proteomics data
Dr. Juan Antonio Vizcaíno
Proteomics Team Leader
EMBL-EBI
Hinxton, Cambridge, UK
Juan A. Vizcaíno
juan@ebi.ac.uk
WT Proteomics Bioinformatics Course 2016
Hinxton, 8 December 2016
Datasets are being reused more and more…
Vaudel et al., Proteomics, 2016
Data download volume for PRIDE Archive in 2015: 198 TB
[Bar chart: downloads in TB per year, 2013 to 2016]
Data sharing in Proteomics
Vaudel et al., Proteomics, 2016
• Data as they are.
• Protein knowledge bases: UniProt, neXtProt.
• Contributing to the Protein Evidence Code.
Protein Evidence codes in UniProt/neXtProt
http://www.uniprot.org/help/protein_existence
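As a small illustration of the evidence codes referred to here, the five protein existence (PE) levels used by UniProt can be written as a lookup table. The helper function name is hypothetical and only serves this sketch; it flags PE1, the level where experimental evidence such as MS identifications can contribute.

```python
# UniProt protein existence (PE) levels, from strongest to weakest evidence.
PROTEIN_EXISTENCE = {
    1: "Evidence at protein level",
    2: "Evidence at transcript level",
    3: "Inferred from homology",
    4: "Predicted",
    5: "Uncertain",
}


def has_protein_level_evidence(pe_level: int) -> bool:
    """Return True only for PE1 (hypothetical helper for this sketch)."""
    if pe_level not in PROTEIN_EXISTENCE:
        raise ValueError(f"Unknown protein existence level: {pe_level}")
    return pe_level == 1


if __name__ == "__main__":
    for level, label in PROTEIN_EXISTENCE.items():
        print(f"PE{level}: {label}")
```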
Use of MS data in UniProt
Use of MS data in neXtProt
Reuse
• Information is not only extracted, but reused in new
experiments with the potential of generating new
knowledge.
• Transitions used in SRM approaches.
• Meta-analysis approaches.
• Spectral libraries.
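To make the SRM point concrete, here is a minimal sketch of how candidate transitions (precursor m/z paired with fragment m/z) could be computed for an unmodified peptide. It is not how SRMAtlas or PeptidePicker select transitions; the function names are hypothetical, only singly charged y-ions are considered, and standard monoisotopic residue masses are assumed.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O
PROTON = 1.007276   # mass of a proton


def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def precursor_mz(seq: str, charge: int) -> float:
    """m/z of the intact peptide at the given charge state (Q1)."""
    return (peptide_mass(seq) + charge * PROTON) / charge


def y_ion_mz(seq: str, n: int, charge: int = 1) -> float:
    """m/z of the y-ion made of the last n residues (Q3)."""
    fragment = seq[-n:]
    mass = sum(MONO[aa] for aa in fragment) + WATER
    return (mass + charge * PROTON) / charge


def transitions(seq: str, precursor_charge: int = 2):
    """List (Q1 m/z, fragment label, Q3 m/z) for y1 .. y(n-1)."""
    q1 = precursor_mz(seq, precursor_charge)
    return [(q1, f"y{n}", y_ion_mz(seq, n)) for n in range(1, len(seq))]


if __name__ == "__main__":
    for q1, label, q3 in transitions("PEPTIDE"):
        print(f"{q1:.4f} -> {label} {q3:.4f}")
```

In practice, transitions taken from public data are ranked by how intensely and reproducibly the fragments were observed, which this sketch does not attempt.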
SRMAtlas
http://www.srmatlas.org/
PeptidePicker
http://mrmpeptidepicker.proteincentre.com/
Meta-analysis approaches
• Combining data from many experiments to extract new
knowledge. Examples:
• Studying the cleavage mechanism and performance of
trypsin.
• Fragmentation patterns.
• Retention time prediction.
• Which is the most suitable reference DB for long-term
proteomics data storage?
• Data integration of experiments done at different time
points.
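The trypsin example above can be sketched as a count of missed cleavages over a pooled list of identified peptides. This is only an illustration: it assumes the simplified rule that trypsin cleaves after K or R unless the next residue is P, and the function names are hypothetical.

```python
import re
from collections import Counter

# An internal K or R that is followed by any residue other than P is a site
# trypsin was expected to cut. The lookahead also excludes the C-terminal
# residue, because nothing follows it.
MISSED_SITE = re.compile(r"[KR](?=[^P])")


def missed_cleavages(peptide: str) -> int:
    """Number of uncut tryptic sites inside one peptide sequence."""
    return len(MISSED_SITE.findall(peptide))


def missed_cleavage_distribution(peptides) -> Counter:
    """Histogram of missed-cleavage counts over many identified peptides."""
    return Counter(missed_cleavages(p) for p in peptides)


if __name__ == "__main__":
    # Toy peptide list; in a meta-analysis these would come from many
    # public datasets.
    pooled = ["AAR", "AKAR", "AKPR", "LSSPATLNSR", "KKAR"]
    print(missed_cleavage_distribution(pooled))
```

Pooling such counts across datasets gives a rough view of digestion efficiency that no single experiment provides.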
Spectral searching
• Concept: To compare experimental spectra to other
experimental spectra.
• There are many spectral libraries publicly available (for
instance, from NIST, PeptideAtlas and PRIDE).
• Custom ‘search engines’ have been developed:
• SpectraST (TPP)
• X!Hunter (GPM)
• BiblioSpec
• It has been claimed that these searches are more
sensitive than sequence database approaches.
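The concept of comparing one experimental spectrum to another can be sketched with a normalised dot product (cosine similarity) on binned peaks. This is a generic illustration, not the scoring function of SpectraST, X!Hunter or BiblioSpec; the function names and the 1 Da bin width are assumptions.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Normalised dot product between two spectra, from 0.0 to 1.0."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy spectra: a query and a library entry sharing two of three peaks.
    query = [(148.06, 30.0), (263.09, 100.0), (376.17, 55.0)]
    library = [(148.06, 25.0), (263.09, 90.0), (504.23, 40.0)]
    print(f"similarity: {cosine_similarity(query, library):.3f}")
```

A spectral search engine would compute a score of this kind between the query and every library spectrum within the precursor mass tolerance, and report the best match.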
Spectral searching (2)
http://peptide.nist.gov/
PRIDE Cluster as a Public Data Mining Resource
• http://www.ebi.ac.uk/pride/cluster
• Spectral libraries for 16 species.
• All clustering results, as well as specific subsets of interest, are available.
• Source code (open source) and Java API.
Reprocess
• Data are reprocessed with the intention of obtaining
new knowledge or to provide an updated view on the
results.
• It mainly serves the same purpose as the original
experiment.
• For instance, a shotgun dataset can be reprocessed
with a different algorithm or an updated sequence
database.
Reprocessing repositories
• These resources collect MS raw data and reprocess them using
one given analysis pipeline and an up-to-date protein
sequence database.
• Main resources: GPMDB and PeptideAtlas (ISB, Seattle).
PeptideAtlas and GPMDB
Draft Human proteome papers published in 2014
Wilhelm et al., Nature, 2014
• Around 60% of the data used for the
analysis come from previous
experiments, most of them stored in
proteomics repositories such as
PRIDE/ProteomeXchange, PASSEL or
MassIVE.
• They complement those data with “exotic”
tissues.
Reprocessing for the validation of controversial data
• Analysis of Tyrannosaurus rex fossils: controversial presence of
collagen (was the sample contaminated? Did the sample contain
any T. rex proteins at all?)
Asara et al. (2007) Science 316: 280-5.
Asara et al. (2007) Science 317: 1324-5.
Bern et al. (2009) JPR 8: 4328-32.
PRIDE Archive assay accession: 8633
Reprocessing for the validation of controversial data (2)
Bromenshenk et al. (2010) PLOS One 5: e13181
Info from R. Chalkley
Reprocessing for the validation of controversial data (3)
Experimental Protocol
1. Collected samples from healthy, collapsing and collapsed bee colonies.
2. Homogenised bees.
3. Digested with trypsin.
4. Analyzed by LC-MS/MS on an LTQ.
5. Searched using Sequest.
6. Filtered results using PeptideProphet and ProteinProphet.
7. Performed further analysis to determine species statistically more
commonly found in collapsing/collapsed colony samples.
Bromenshenk et al. (2010) PLOS One 5: e13181
Info from R. Chalkley
Reprocessing for the validation of controversial data (4)
• Big pitfall: the search database was composed only of viral
proteins. No bee proteins at all!
• After re-searching the data, there is no evidence for viral
peptides/proteins in any of their data. Identified instead: honey bee, fruit fly,
wasp, moth, human keratin, bacteria that like sugary
environments, …
• “We believe that there is currently insufficient evidence to
conclude that bees are a natural host for IIV-6, let alone that
the virus is linked to CCD”.
Knudsen & Chalkley (2011) PLOS One 6: e20873
Foster (2011) MCP 10: M110.006387
Info from R. Chalkley
Reprocessing for the validation of controversial data
Datasets PXD000561 and PXD000865 in PRIDE Archive
Various reanalyses of these datasets have been performed…
Reanalysis of the Pandey dataset (Nature, 2014) by J. Choudhary’s group at the
Sanger Institute
Wright et al., Nat Commun, 2016
Dataset PXD000561
http://www.ebi.ac.uk/gxa
Repurposing
• Data are considered in light of a question or a context
that is different from the original study.
• Proteogenomics studies
• Discovery of novel PTMs.
Examples of repurposing datasets: proteogenomics
Data in public resources can be used for genome annotation purposes
Repurposing: new PTMs found
• Individual authors can reprocess raw data with new
hypotheses in mind (not taken into account by the original
authors).
• Recent examples (using phosphoproteomics data sets):
• O-GlcNAc-6-phosphate [1]
• Phosphoglyceryl [2]
• ADP-ribosylation [3]
[1] Hahne & Kuster, Mol Cell Proteomics (2012) 11(10): 1063-9
[2] Moellering & Cravatt, Science (2013) 341: 549-553
[3] Matic et al., Nat Methods (2012) 9: 771-2
CompOmics Open Source Analysis Pipeline
SearchGUI: https://github.com/compomics/searchgui
Vaudel M, Barsnes H, Berven FS, Sickmann A, Martens L:
Proteomics 2011;11(5):996-9.
PeptideShaker: https://github.com/compomics/peptide-shaker
Vaudel M, Burkhart J, Zahedi RP, Berven FS, Sickmann A, Martens L, Barsnes H:
Nature Biotechnology 2015;33(1):22-4.
Reshake PRIDE data!
Find the desired PRIDE project …
… inspect the project details …
… and start re-analyzing the data!
Public datasets from different omics: OmicsDI
http://www.ebi.ac.uk/Tools/omicsdi/
• Aims to integrate ‘omics’ datasets (proteomics,
transcriptomics, metabolomics and genomics at present).
PRIDE
MassIVE
jPOST
PASSEL
GPMDB
ArrayExpress
Expression Atlas
MetaboLights
Metabolomics Workbench
GNPS
EGA
Perez-Riverol et al., Nat Biotechnol, in press
OmicsDI: Portal for omics datasets
Acknowledgements
http://www.ncbi.nlm.nih.gov/pubmed/26449181
http://onlinelibrary.wiley.com/doi/10.1002/pmic.201500295/epdf
Questions?

The Proteomics Standards Initiative (PSI)
 

Recently uploaded

Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
The Black hole shadow in Modified Gravity
The Black hole shadow in Modified GravityThe Black hole shadow in Modified Gravity
The Black hole shadow in Modified GravitySubhadipsau21168
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)DHURKADEVIBASKAR
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaPraksha3
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 

Recently uploaded (20)

Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
The Black hole shadow in Modified Gravity
The Black hole shadow in Modified GravityThe Black hole shadow in Modified Gravity
The Black hole shadow in Modified Gravity
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)
 
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tantaDashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
Dashanga agada a formulation of Agada tantra dealt in 3 Rd year bams agada tanta
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 

Reuse of public data in proteomics

  • 1. Exploring the potential of public proteomics data Dr. Juan Antonio Vizcaíno Proteomics Team Leader EMBL-EBI Hinxton, Cambridge, UK
  • 2. Juan A. Vizcaíno juan@ebi.ac.uk WT Proteomics Bioinformatics Course 2016 Hinxton, 8 December 2016 Datasets are being reused more and more… Vaudel et al., Proteomics, 2016. Data download volume for PRIDE Archive in 2015: 198 TB. [Figure: bar chart of PRIDE Archive downloads in TB per year, 2013 to 2016]
  • 3. Data sharing in Proteomics Vaudel et al., Proteomics, 2016
  • 4. Data sharing in Proteomics
  • 5. Data sharing in Proteomics
  • 6. Data sharing in Proteomics • Data as they are. • Protein knowledge bases: UniProt, neXtProt. • Contributing to the Protein Evidence Code.
  • 7. Protein Evidence codes in UniProt/neXtProt http://www.uniprot.org/help/protein_existence
  • 8. Use of MS data in UniProt
  • 9. Use of MS data in neXtProt
  • 10. Data sharing in Proteomics
  • 11. Reuse • Information is not only extracted, but reused in new experiments with the potential of generating new knowledge. • Transitions used in SRM approaches. • Meta-analysis approaches. • Spectral libraries.
  • 12. SRMAtlas http://www.srmatlas.org/
  • 13. PeptidePicker http://mrmpeptidepicker.proteincentre.com/
  • 14. Meta-analysis approaches • Putting data coming from a lot of experiments together, to extract new knowledge. Examples: • Study the cleavage mechanism and performance of trypsin. • Fragmentation patterns. • Retention time prediction. • Which is the most suitable reference DB for long-term proteomics data storage? • Data integration of experiments done at different time points.
  • 15. Spectral searching • Concept: To compare experimental spectra to other experimental spectra. • There are many spectral libraries publicly available (for instance, from NIST, PeptideAtlas and PRIDE). • Custom ‘search engines’ have been developed: • SpectraST (TPP) • X!Hunter (GPM) • Bibliospec • It has been claimed that these searches are more sensitive than sequence database approaches.
  • 16. Spectral searching (2) http://peptide.nist.gov/
  • 17. PRIDE Cluster as a Public Data Mining Resource • http://www.ebi.ac.uk/pride/cluster • Spectral libraries for 16 species. • All clustering results, as well as specific subsets of interest available. • Source code (open source) and Java API
  • 18. Data sharing in Proteomics
  • 19. Reprocess • Data are reprocessed with the intention of obtaining new knowledge or to provide an updated view on the results. • It mainly serves the same purpose as the original experiment. • For instance, a shotgun dataset can be reprocessed with a different algorithm or an updated sequence database.
  • 20. Reprocessing repositories • These resources collect MS raw data and reprocess them using one given analysis pipeline and an up-to-date protein sequence database. • Main resources: GPMDB and PeptideAtlas (ISB, Seattle).
  • 21. PeptideAtlas and GPMDB
  • 22. Draft Human proteome papers published in 2014 Wilhelm et al., Nature, 2014 • Around 60% of the data used for the analysis comes from previous experiments, most of them stored in proteomics repositories such as PRIDE/ProteomeXchange, PASSEL or MassIVE. • They complement that data with “exotic” tissues.
  • 23. Reprocessing for the validation of controversial data • Analysis of Tyrannosaurus rex fossils: controversial presence of collagen (is it a contamination of the sample? Did the sample contain any T. rex proteins at all?) Asara et al. (2007) Science 316: 280-5. Asara et al. (2007) Science 316: 1324-5. Bern et al. (2009) JPR 9: 4328-32 PRIDE Archive assay accession 8633
  • 24. Reprocessing for the validation of controversial data (2) Info from R. Chalkley. Bromenshenk et al. (2011) PLOS One 5: e13181
  • 25. Reprocessing for the validation of controversial data (3) Experimental Protocol 1. Collected samples from healthy, collapsing and collapsed bee colonies. 2. Homogenised bees. 3. Digested with trypsin. 4. Analyzed by LC-MS/MS on LTQ. 5. Searched using Sequest. 6. Filtered results using PeptideProphet and ProteinProphet. 7. Performed further analysis to determine species statistically more commonly found in collapsing/collapsed colony samples. Info from R. Chalkley. Bromenshenk et al. (2011) PLOS One 5: e13181
  • 26. Reprocessing for the validation of controversial data (4) • Big pitfall: the search database was composed only of viral proteins. No bee proteins at all! • After re-searching the data, there is no evidence for viral peptides/proteins in any of their data; the identifications instead come from honey bee, fruit fly, wasp, moth, human keratin, bacteria that like sugary environments, … • “We believe that there is currently insufficient evidence to conclude that bees are a natural host for IIV-6, let alone that the virus is linked to CCD”. Info from R. Chalkley. Knudsen & Chalkley (2011) PLOS One 6: e20873. Foster (2011), MCP 10: M110.006387
  • 27. Reprocessing for the validation of controversial data Datasets PXD000561 and PXD000865 in PRIDE Archive
  • 28. Various reanalyses of these datasets have been performed… Reanalysis of the Pandey dataset (Nature, 2014) by J. Choudhary’s group at the Sanger Institute. Wright et al., Nat Commun, 2016. Dataset PXD000561. http://www.ebi.ac.uk/gxa
  • 29. Data sharing in Proteomics
  • 30. Repurposing • Data are considered in light of a question or a context that is different from the original study. • Proteogenomics studies • Discovery of novel PTMs.
  • 31. Examples of repurposing datasets: proteogenomics Data in public resources can be used for genome annotation purposes
  • 32. Repurposing: new PTMs found • Individual authors can reprocess raw data with new hypotheses in mind (not taken into account by the original authors). • Recent examples (using phosphoproteomics data sets): • O-GlcNAc-6-phosphate [1] • Phosphoglyceryl [2] • ADP-ribosylation [3] [1] Hahne & Kuster, Mol Cell Proteomics (2012) 11 10 1063-9 [2] Moellering & Cravatt, Science (2013) 341 549-553 [3] Matic et al., Nat Methods (2012) 9 771-2
  • 33. CompOmics Open Source Analysis Pipeline Vaudel M, Barsnes H, Berven FS, Sickmann A, Martens L: Proteomics 2011;11(5):996-9. https://github.com/compomics/searchgui Vaudel M, Burkhart J, Zahedi RP, Berven FS, Sickmann A, Martens L, Barsnes H: Nature Biotechnology 2015; 33(1):22-4. https://github.com/compomics/peptide-shaker
  • 34. Reshake PRIDE data! Find the desired PRIDE project … inspect the project details … and start re-analyzing the data!
  • 35. Public datasets from different omics: OmicsDI http://www.ebi.ac.uk/Tools/omicsdi/ • Aims to integrate ‘omics’ datasets (proteomics, transcriptomics, metabolomics and genomics at present). Resources shown: PRIDE, MassIVE, jPOST, PASSEL, GPMDB, ArrayExpress, Expression Atlas, MetaboLights, Metabolomics Workbench, GNPS, EGA. Perez-Riverol et al., Nat Biotechnol, in press
  • 36. OmicsDI: Portal for omics datasets
  • 37. OmicsDI: Portal for omics datasets
  • 38. Acknowledgements http://www.ncbi.nlm.nih.gov/pubmed/26449181 http://onlinelibrary.wiley.com/doi/10.1002/pmic.201500295/epdf
  • 39. Questions?
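
The spectral searching concept from slide 15 (comparing an experimental spectrum against previously observed spectra rather than against theoretical ones) can be illustrated with a minimal Python sketch. It bins peaks by m/z and ranks library entries by cosine similarity. This is an assumption-laden toy, not the scoring used by SpectraST, X!Hunter or Bibliospec: the function names, the 1 Da bin width, the peptide labels and all peak values are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin index -> intensity)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(a, b):
    """Cosine of the angle between two binned spectra; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_library(query_peaks, library, bin_width=1.0):
    """Return (name, score) of the library spectrum most similar to the query."""
    query = bin_spectrum(query_peaks, bin_width)
    scored = [
        (name, cosine_similarity(query, bin_spectrum(peaks, bin_width)))
        for name, peaks in library.items()
    ]
    return max(scored, key=lambda item: item[1])


# Hypothetical library: each entry is a list of (m/z, intensity) peaks.
library = {
    "hypothetical_peptide_A": [(175.1, 10.0), (262.2, 50.0), (375.3, 100.0)],
    "hypothetical_peptide_B": [(147.1, 80.0), (304.2, 30.0), (433.3, 60.0)],
}

# Hypothetical query: same fragments as entry A, small m/z shifts, doubled intensity.
query = [(175.12, 20.0), (262.15, 100.0), (375.28, 200.0)]

best_name, best_score = search_library(query, library)
print(best_name, round(best_score, 3))
```

Cosine similarity ignores overall intensity scale, which is why the doubled query intensities still match entry A. Real spectral search engines add precursor mass filtering, peak weighting and statistical validation on top of a similarity measure of this kind.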