Proteomics repositories
Dr. Juan Antonio Vizcaíno
Proteomics Team Leader
EMBL-EBI
Hinxton, Cambridge, UK
Juan A. Vizcaíno
juan@ebi.ac.uk
WT Proteomics Bioinformatics Course 2016
Hinxton, 8 December 2016
Overview
• Why share MS proteomics data?
• Types of information stored in MS proteomics repositories
• Main existing repositories and their main characteristics
• No data reprocessing
• Data reprocessing
• Other resources
Corresponding public repositories
• Genomics: DNA sequence databases (GenBank, EMBL, DDBJ)
• Transcriptomics: ArrayExpress (EBI), GEO (NCBI)
• Proteomics: MS proteomics resources (ProteomeXchange)
• Metabolomics: MetaboLights (MetabolomeXchange)
Data sharing in Proteomics
• Proteomics data can be very complex and its interpretation is
often troublesome and/or controversial.
• In other ‘omics’ fields, a data sharing ‘culture’ is well established.
Generally, it is considered to be good scientific practice.
• In proteomics, the ‘culture’ is definitely evolving in that direction.
A big shift has taken place in the last few years.
• Scientific journals and funding agencies are two of the main
drivers.
Reproducible Science
http://www.nature.com/nature/focus/reproducibility/
What is a proteomics publication in 2016?
• Proteomics studies generate potentially large amounts of
data and results.
• Ideally, a proteomics publication needs to:
• Summarize the results of the study
• Provide supporting information for reliability of any
results reported
• Information in a publication:
• Manuscript
• Supplementary material
• Associated data submitted to a public repository
Journal Submission Recommendations
• Journal guidelines recommend and/or mandate
submission to proteomics repositories:
• Proteomics
• Nature Biotechnology
• Nature Methods
• Molecular and Cellular Proteomics
• Funding agencies are enforcing public deposition of data
to maximize the value of the funds provided.
Main types of information stored
• 1) Original experimental data recorded by the mass
spectrometer (primary data): raw data and peak lists.
• 2) Identification results inferred from the original primary
data
• 3) Quantification information
• 4) Experimental and technical metadata
• 5) Any other type of information (e.g. scripts)
Current PSI Standard File Formats for MS
• mzML: MS data
• mzIdentML: identification
• mzQuantML: quantitation
• mzTab: final results
• TraML: SRM
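A minimal sketch of how the format list above could be used to sort the files of a dataset by content type. The file extensions are an assumption (commonly used conventions, not stated on the slide), and the function name psi_format is hypothetical.

```python
from pathlib import Path

# Assumed, conventional extensions for each PSI format (not taken from the slide).
PSI_FORMATS = {
    ".mzml": ("mzML", "MS data"),
    ".mzid": ("mzIdentML", "identification"),
    ".mzq": ("mzQuantML", "quantitation"),
    ".mztab": ("mzTab", "final results"),
    ".traml": ("TraML", "SRM"),
}


def psi_format(filename):
    """Return (format name, content type) for a file name, or None if unknown."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match on the last extension
    return PSI_FORMATS.get(suffix)


if __name__ == "__main__":
    for name in ["run01.mzML", "search.mzid", "notes.txt"]:
        print(name, "->", psi_format(name))
```

Only the last extension is inspected, so a compressed file such as run01.mzML.gz is reported as unknown in this sketch.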
Proteomics repositories
• Many different workflows need to be supported. They provide
complementary ‘views’.
• No data reprocessing. Data is stored as ‘published’ or
originally analysed:
• PRIDE Archive (focused on MS/MS data, but all data types are supported)
• MassIVE (focused on MS/MS data)
• jPOST (focused on MS/MS data)
• PASSEL (only SRM data)
• Data reprocessing (MS/MS data):
• PeptideAtlas and GPMDB
• proteomicsDB and HPM
ProteomeXchange: a global, distributed proteomics database
• Goal: Development of a framework to allow standard data submission and
dissemination pipelines between the main existing proteomics repositories.
• Member repositories: PRIDE (MS/MS data), MassIVE (MS/MS data), jPOST (MS/MS data, new in 2016) and PASSEL (SRM data)
• Submitted data: raw data (Raw), identification and quantification results (ID/Q) and metadata (Meta)
• Mandatory raw data deposition since July 2015
http://www.proteomexchange.org
Vizcaíno et al., Nat Biotechnol, 2014
Deutsch et al., NAR, 2017, in press
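A minimal sketch of how dataset accessions from these resources might be told apart in a script. The prefixes (PXD, MSV, JPST, PASS) are an assumption based on commonly seen accessions and are not stated on the slide; the function name classify_accession is hypothetical. Check each repository's documentation before relying on these patterns.

```python
import re

# Assumed accession prefixes (not from the slide); the digit count is left open.
ACCESSION_PATTERNS = {
    "ProteomeXchange": re.compile(r"^PXD\d+$"),
    "MassIVE": re.compile(r"^MSV\d+$"),
    "jPOST": re.compile(r"^JPST\d+$"),
    "PASSEL": re.compile(r"^PASS\d+$"),
}


def classify_accession(accession):
    """Return the resource whose assumed pattern matches, or None."""
    cleaned = accession.strip().upper()
    for resource, pattern in ACCESSION_PATTERNS.items():
        if pattern.match(cleaned):
            return resource
    return None


if __name__ == "__main__":
    for acc in ["PXD000001", "msv000079843", "ABC123"]:
        print(acc, "->", classify_accession(acc))
```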
Resources that don’t reprocess data
1) Resources that try to represent the authors’ analysis
view on the data.
• Various workflows are allowed and they can provide
complementary results.
• Data are not ‘updated’ over time. However, meta-analysis
on top is possible.
• False discovery rates (FDRs) accumulate when datasets are combined.
• Main representatives: PRIDE Archive and MassIVE
(MS/MS data) and PeptideAtlas/PASSEL (SRM data).
• Data standards are essential.
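The FDR accumulation mentioned above can be illustrated with a small worked example. This is a deliberately simplified sketch under stated assumptions: every dataset is assumed to identify the same set of true items, while false positives are assumed never to overlap between datasets. Real data sit somewhere in between, and the function name combined_fdr is hypothetical.

```python
def combined_fdr(n_datasets, true_ids, fdr):
    """Estimate the FDR of the union of n_datasets results.

    Assumptions (illustrative only): each dataset reports the same
    `true_ids` correct identifications, filtered at `fdr`, and the
    false positives of different datasets never coincide.
    """
    # False positives per dataset implied by its own FDR.
    false_per_dataset = true_ids * fdr / (1.0 - fdr)
    total_false = false_per_dataset * n_datasets  # distinct errors add up
    total_true = true_ids                         # correct IDs overlap fully
    return total_false / (total_true + total_false)


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, "datasets ->", round(combined_fdr(n, 990, 0.01), 4))
```

Under these assumptions the combined error rate grows with every dataset added, even though each individual dataset was filtered at 1% FDR.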
PRIDE (PRoteomics IDEntifications) Archive
http://www.ebi.ac.uk/pride/archive
• PRIDE stores mass spectrometry (MS)-based
proteomics data:
• Peptide and protein expression data
(identification and quantification)
• Post-translational modifications
• Mass spectra (raw data and peak lists)
• Technical and biological metadata
• Any other related information
• Full support for tandem MS approaches
• Any type of data can be stored.
Martens et al., Proteomics, 2005
Vizcaíno et al., NAR, 2016
MassIVE (UCSD)
http://proteomics.ucsd.edu/service/massive/
• Data repository for MS proteomics data
• Tools available for users to analyse their own data
• Joined ProteomeXchange in June 2014.
MassIVE Interactivity
• MassIVE = Mass spectrometry Interactive Virtual Environment
http://massive.ucsd.edu
http://proteomics.ucsd.edu
MassIVE: Do it yourself
1. MSGF+ - Database search engine
2. MSPLIT – Spectral Library Search Engine
3. ENOSI – ProteoGenomic Search Engine
4. MODa - Multi-blind modification database search engine
5. Spectral Networks – spectral alignment-based
analysis and propagation of identifications
6. Multi-pass - MSPLIT, MSGFDB, MODa cascade Search
Workflow
7. MSGFDB - Database search engine
8. MSPLIT-DIA – Spectral Library Search for SWATH
9. Upload your own! (mzIdentML, mzTab, TSV)
http://massive.ucsd.edu
http://proteomics.ucsd.edu
jPOST Repository site
(www.jpost.org)
• Joined ProteomeXchange
in July 2016
PASSEL: repository for SRM data
http://www.peptideatlas.org/passel/
• Suitable for SRM assays
• Uses the PSI standard TraML plus the output of the most popular vendor pipelines
• Started in 2012
• Part of the ProteomeXchange consortium
Farrah et al., Proteomics, 2012
Reprocessing repositories
• These resources collect MS raw data and reprocess it using
one given analysis pipeline and an up-to-date protein
sequence database.
• Advantage: they provide a ‘standardized’ and updated view
on the experimental data available.
• Disadvantage: only one common analysis method is used, so there can be
information loss.
• The result is different from the authors’ view on the data.
• Main resources: GPMDB and PeptideAtlas (ISB, Seattle).
• Novel resources: proteomicsDB and the Human Proteome Map.
PeptideAtlas
http://www.peptideatlas.org
- Developed at the Institute for Systems Biology (ISB, Seattle, USA)
- Peptide identifications from MS/MS approaches
- Data are reprocessed using the popular Trans-Proteomic Pipeline (TPP)
- Uses PeptideProphet to derive a probability of correct identification for every peptide it contains
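A minimal sketch of why a per-peptide probability of correct identification is useful: if each accepted peptide has probability p of being correct, the expected number of errors in the accepted set is the sum of (1 - p). This is a generic property of such probabilities, not the TPP or PeptideAtlas implementation; the function name, the 0.9 threshold and the example peptides are hypothetical.

```python
def filter_by_probability(peptides, min_probability=0.9):
    """Keep peptides at or above the threshold and estimate their FDR.

    peptides: list of (sequence, probability_correct) pairs.
    Returns (accepted, estimated_fdr), where the estimate is the mean
    of (1 - probability) over the accepted peptides.
    """
    accepted = [(seq, p) for seq, p in peptides if p >= min_probability]
    if not accepted:
        return accepted, 0.0
    expected_false = sum(1.0 - p for _, p in accepted)
    return accepted, expected_false / len(accepted)


if __name__ == "__main__":
    # Made-up sequences and probabilities, for illustration only.
    example = [("PEPTIDEK", 0.99), ("SAMPLER", 0.95), ("LOWSCORER", 0.40)]
    kept, fdr = filter_by_probability(example)
    print(len(kept), "peptides kept, estimated FDR", round(fdr, 3))
```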
PeptideAtlas
• All peptide IDs are mapped to Ensembl using ProteinProphet (to handle protein inference)
• Provides proteotypic peptide predictions
• Limited metadata available
• Part of the HPP project
Deutsch et al., Proteomics, 2005
Desiere et al., NAR, 2006
Deutsch et al., EMBO Rep, 2008
PeptideAtlas builds
Builds are updated on a regular basis (usually once a year)
Examples of builds:
- Human (HPP context)
- Human plasma
- Human urine
- Drosophila
- Mouse
- Mouse plasma
- Cow
- Yeast
…
GPMDB (Global Proteome Machine DB)
http://gpmdb.thegpm.org/
• Originally developed by R. Beavis & R. Craig
• End point of the GPM proteomics pipeline, to aid in the process of validating peptide MS/MS spectra and protein coverage patterns.
Craig et al., J Proteome Res, 2004
GPMDB
• Data are reprocessed using the popular X!Tandem search engine or the X!Hunter spectral library searching algorithm
• Also provides proteotypic peptides
GPMDB
• Nice visualization features
• Provides very limited annotation with GO, BTO
• Some support for targeted approaches is available
• Part of the HPP consortium
http://thehpp.org/
The Human Proteome Project (HPP)
HPP guidelines version 2.1
Draft Human proteome papers published in 2014
Wilhelm et al., Nature, 2014
Kim et al., Nature, 2014
• Two independent groups claimed to have produced the
first complete draft of the human proteome by MS.
• Some of their findings are controversial and need further
validation… but they generated a lot of discussion and put
proteomics in the spotlight.
• Two proteomics resources have been developed:
proteomicsDB and the Human Proteome Map (HPM).
Nature cover 29 May 2014
ProteomicsDB https://www.proteomicsdb.org/
• Data analysis using Mascot and MaxQuant
• The way the protein FDR is calculated is controversial
• Quantification information using label-free techniques
• New datasets are added on a regular basis
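For context on the protein FDR point above, here is a minimal sketch of the simplest target-decoy estimate (decoy hits divided by target hits above a score threshold). It is a generic textbook-style illustration and is not the method used by ProteomicsDB; the function name, scores and threshold are hypothetical.

```python
def target_decoy_fdr(hits, score_threshold):
    """Estimate FDR as decoys / targets among hits scoring at or above the threshold.

    hits: iterable of (score, is_decoy) pairs.
    Returns 0.0 when no target passes the threshold.
    """
    passing = [(score, is_decoy) for score, is_decoy in hits if score >= score_threshold]
    targets = sum(1 for _, is_decoy in passing if not is_decoy)
    decoys = sum(1 for _, is_decoy in passing if is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


if __name__ == "__main__":
    # Made-up protein-level scores, for illustration only.
    example = [(90, False), (80, False), (70, True), (60, False), (50, True)]
    for threshold in (75, 65):
        print("threshold", threshold, "->", target_decoy_fdr(example, threshold))
```

How decoy hits are counted once peptide evidence is aggregated into proteins is exactly where approaches differ, which is one reason protein-level FDR is debated.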
ProteomicsDB (2)
Human Proteome Map (HPM)
• Developed by the Pandey group.
• Data reanalysis using Mascot.
• Protein FDR is not mentioned at all in the
corresponding Nature paper.
• Static resource: it will not be updated
any longer.
http://www.humanproteomemap.org/
Chorus
https://chorusproject.org/pages/index.html
• Developed by M. MacCoss’ group in
Seattle (UW).
• Built on top of Amazon Cloud
technologies
• Provides data analysis capabilities for
the users
• Free for public datasets.
• The objective is to connect the data to
analysis tools in a cloud environment
Other repositories
• MaxQB
• Human Proteinpedia
COPaKB
Cardiac Organellar Protein Atlas Knowledgebase
International collaboration (EMBL-EBI involved)
Windows Client and iPad App
Submit data for analysis in dta and mzML formats
Data submitted to a ProLuCID pipeline
No MS data download
CPTAC data portal
Pep2pro (Arabidopsis)
http://fgcz-pep2pro.uzh.ch/
Centered on Arabidopsis data
Download spectrum by spectrum
Quantitative information
Linked to gelmap.de (2DE)
FINAL THOUGHTS
Why are repositories not more popular?
1. Don’t want to share data
• Researchers don’t like to be shown that they did not analyze the
data as well as they could have.
• Their FDR may be higher than they reported/think.
• Researchers are worried that they missed something in the data
that they could discover if they go back to it at a later date
• Don’t want other authors to get a publication from their data.
• However, this philosophy is changing gradually…
Slide from R. Chalkley
Why are repositories not more popular? (2)
2. Submission burden
• Getting data into correct format may require some work
• Author is not necessarily computer-savvy
• Having to also supply metadata is seen as a burden, if the
information is already present in an associated manuscript
• Associated raw data may be many GB in size; file transfer to
repository could take a while
Authors are impatient: want to spend time doing science, not
administration!
Slide from R. Chalkley
Conclusions
• Importance of sharing MS proteomics data
• The main existing proteomics repositories are
complementary in focus and functionality.
• Main characteristics of:
• PeptideAtlas and GPMDB (reprocess data)
• PASSEL, MassIVE, jPOST and PRIDE Archive
(at present they do not reprocess data).
• New resources: proteomicsDB, HPM.
• Chorus, CPTAC portal,…
Recommended reading
• Perez-Riverol et al., Proteomics, 2015. PMID: 25158685
Questions?

More Related Content

What's hot

The FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems BiologyThe FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems BiologyFAIRDOM
 
ANDS presentation at AHMEN meeting 6 June 2016
ANDS presentation at AHMEN meeting 6 June 2016ANDS presentation at AHMEN meeting 6 June 2016
ANDS presentation at AHMEN meeting 6 June 2016ARDC
 
Guide to PHARMACOLOGY: a web-Based Compendium for Research and Education
Guide to PHARMACOLOGY: a web-Based Compendium for Research and EducationGuide to PHARMACOLOGY: a web-Based Compendium for Research and Education
Guide to PHARMACOLOGY: a web-Based Compendium for Research and EducationChris Southan
 
Research data: publishers, policies and patient privacy
Research data: publishers, policies and patient privacyResearch data: publishers, policies and patient privacy
Research data: publishers, policies and patient privacyARDC
 
My projects at University of Oxford e-Research Centre - Nov 2014
My projects at University of Oxford e-Research Centre - Nov 2014My projects at University of Oxford e-Research Centre - Nov 2014
My projects at University of Oxford e-Research Centre - Nov 2014Susanna-Assunta Sansone
 
Curatorial data wrangling for the Guide to PHARMACOLGY
Curatorial data wrangling for the Guide to PHARMACOLGY Curatorial data wrangling for the Guide to PHARMACOLGY
Curatorial data wrangling for the Guide to PHARMACOLGY Chris Southan
 
The Rise of the Data Journal
The Rise of the Data JournalThe Rise of the Data Journal
The Rise of the Data JournalMarieke Guy
 
Open interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIOpen interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIPistoia Alliance
 
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesApplication of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesPistoia Alliance
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataTom Plasterer
 
Electronic Theses and Dissertations in Peru: A Twelve-Year Experience and Its...
Electronic Theses and Dissertations in Peru: A Twelve-Year Experience and Its...Electronic Theses and Dissertations in Peru: A Twelve-Year Experience and Its...
Electronic Theses and Dissertations in Peru: A Twelve-Year Experience and Its...Libio Huaroto
 
Reproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsReproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsCarole Goble
 
schema.org and biomedical ontologies
schema.org and biomedical ontologies schema.org and biomedical ontologies
schema.org and biomedical ontologies Simon Jupp
 
2013 DataCite Summer Meeting - Out of Cite, Out of Mind: Report of the CODATA...
2013 DataCite Summer Meeting - Out of Cite, Out of Mind: Report of the CODATA...2013 DataCite Summer Meeting - Out of Cite, Out of Mind: Report of the CODATA...
2013 DataCite Summer Meeting - Out of Cite, Out of Mind: Report of the CODATA...datacite
 
Capturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor dataCapturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor dataChris Southan
 
Making your data good enough for sharing.
Making your data good enough for sharing.Making your data good enough for sharing.
Making your data good enough for sharing.FAIRDOM
 

What's hot (20)

The FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems BiologyThe FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems Biology
 
ANDS presentation at AHMEN meeting 6 June 2016
ANDS presentation at AHMEN meeting 6 June 2016ANDS presentation at AHMEN meeting 6 June 2016
ANDS presentation at AHMEN meeting 6 June 2016
 
Guide to PHARMACOLOGY: a web-Based Compendium for Research and Education
Guide to PHARMACOLOGY: a web-Based Compendium for Research and EducationGuide to PHARMACOLOGY: a web-Based Compendium for Research and Education
Guide to PHARMACOLOGY: a web-Based Compendium for Research and Education
 
Research data: publishers, policies and patient privacy
Research data: publishers, policies and patient privacyResearch data: publishers, policies and patient privacy
Research data: publishers, policies and patient privacy
 
My projects at University of Oxford e-Research Centre - Nov 2014
My projects at University of Oxford e-Research Centre - Nov 2014My projects at University of Oxford e-Research Centre - Nov 2014
My projects at University of Oxford e-Research Centre - Nov 2014
 
Curatorial data wrangling for the Guide to PHARMACOLGY
Curatorial data wrangling for the Guide to PHARMACOLGY Curatorial data wrangling for the Guide to PHARMACOLGY
Curatorial data wrangling for the Guide to PHARMACOLGY
 
The Rise of the Data Journal
The Rise of the Data JournalThe Rise of the Data Journal
The Rise of the Data Journal
 
Open interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIOpen interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBI
 
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesApplication of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
 
Dataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* DataDataset Catalogs as a Foundation for FAIR* Data
Dataset Catalogs as a Foundation for FAIR* Data
 
Electronic Theses and Dissertations in Peru: A Twelve-Year Experience and Its...
Electronic Theses and Dissertations in Peru: A Twelve-Year Experience and Its...Electronic Theses and Dissertations in Peru: A Twelve-Year Experience and Its...
Electronic Theses and Dissertations in Peru: A Twelve-Year Experience and Its...
 
Reproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsReproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trends
 
schema.org and biomedical ontologies
schema.org and biomedical ontologies schema.org and biomedical ontologies
schema.org and biomedical ontologies
 
Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...
 
ORCID Principles
ORCID PrinciplesORCID Principles
ORCID Principles
 
Stephenson - Data Curation for Quantitative Social Science Research
Stephenson - Data Curation for Quantitative Social Science ResearchStephenson - Data Curation for Quantitative Social Science Research
Stephenson - Data Curation for Quantitative Social Science Research
 
Levine - Data Curation; Ethics and Legal Considerations
Levine - Data Curation; Ethics and Legal ConsiderationsLevine - Data Curation; Ethics and Legal Considerations
Levine - Data Curation; Ethics and Legal Considerations
 
2013 DataCite Summer Meeting - Out of Cite, Out of Mind: Report of the CODATA...
2013 DataCite Summer Meeting - Out of Cite, Out of Mind: Report of the CODATA...2013 DataCite Summer Meeting - Out of Cite, Out of Mind: Report of the CODATA...
2013 DataCite Summer Meeting - Out of Cite, Out of Mind: Report of the CODATA...
 
Capturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor dataCapturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor data
 
Making your data good enough for sharing.
Making your data good enough for sharing.Making your data good enough for sharing.
Making your data good enough for sharing.
 

Similar to Proteomics repositories

An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...Juan Antonio Vizcaino
 
ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015Juan Antonio Vizcaino
 
Mining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsMining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsJuan Antonio Vizcaino
 
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Juan Antonio Vizcaino
 
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeData volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeJuan Antonio Vizcaino
 
Experiences to learn from the MS proteomics field
Experiences to learn from the MS proteomics fieldExperiences to learn from the MS proteomics field
Experiences to learn from the MS proteomics fieldJuan Antonio Vizcaino
 

Similar to Proteomics repositories (20)

Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
Pride and ProteomeXchange
Pride and ProteomeXchangePride and ProteomeXchange
Pride and ProteomeXchange
 
PRIDE-ProteomeXchange
PRIDE-ProteomeXchangePRIDE-ProteomeXchange
PRIDE-ProteomeXchange
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
Reuse of public data in proteomics
Reuse of public data in proteomicsReuse of public data in proteomics
Reuse of public data in proteomics
 
PRIDE and ProteomeXchange
PRIDE and ProteomeXchangePRIDE and ProteomeXchange
PRIDE and ProteomeXchange
 
Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...
 
Proteomexchange
ProteomexchangeProteomexchange
Proteomexchange
 
ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015
 
ProteomeXchange update HUPO 2016
ProteomeXchange update HUPO 2016ProteomeXchange update HUPO 2016
ProteomeXchange update HUPO 2016
 
Mining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsMining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasets
 
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
 
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeData volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
 
Experiences to learn from the MS proteomics field
Experiences to learn from the MS proteomics fieldExperiences to learn from the MS proteomics field
Experiences to learn from the MS proteomics field
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 





Proteomics repositories

  • 1. Proteomics repositories Dr. Juan Antonio Vizcaíno Proteomics Team Leader EMBL-EBI Hinxton, Cambridge, UK
  • 2. Juan A. Vizcaíno juan@ebi.ac.uk WT Proteomics Bioinformatics Course 2016 Hinxton, 8 December 2016 Overview • Why share MS proteomics data? • Types of information stored in MS proteomics repositories • Main existing repositories and their main characteristics • No data reprocessing • Data reprocessing • Other resources
  • 3. Corresponding public repositories • Genomics: DNA sequence databases (GenBank, EMBL, DDBJ) • Transcriptomics: ArrayExpress (EBI), GEO (NCBI) • Proteomics: MS proteomics resources (ProteomeXchange) • Metabolomics: MetaboLights (MetabolomeXchange)
  • 4. Data sharing in Proteomics • Proteomics data can be very complex and its interpretation is often troublesome and/or controversial. • In other ‘omics’ fields, a data sharing ‘culture’ is well established. Generally, it is considered to be good scientific practice. • In proteomics, the ‘culture’ is definitely evolving in that direction. A big shift has happened in the last few years. • Scientific journals and funding agencies are two of the main drivers.
  • 5. Reproducible Science http://www.nature.com/nature/focus/reproducibility/
  • 6. What is a proteomics publication in 2016? • Proteomics studies generate potentially large amounts of data and results. • Ideally, a proteomics publication needs to: • Summarize the results of the study • Provide supporting information for the reliability of any results reported • Information in a publication: • Manuscript • Supplementary material • Associated data submitted to a public repository
  • 7. Journal Submission Recommendations • Journal guidelines recommend and/or mandate submission to proteomics repositories: • Proteomics • Nature Biotechnology • Nature Methods • Molecular and Cellular Proteomics • Funding agencies are enforcing public deposition of data to maximize the value of the funds provided.
  • 8. Overview • Why share MS proteomics data? • Types of information stored in MS proteomics repositories • Main existing repositories and their main characteristics • No data reprocessing • Data reprocessing • Other resources
  • 9. Main types of information stored • 1) Original experimental data recorded by the mass spectrometer (primary data): raw data and peak lists • 2) Identification results inferred from the original primary data • 3) Quantification information • 4) Experimental and technical metadata • 5) Any other type of information (e.g. scripts)
  • 10. Current PSI Standard File Formats for MS • mzML: MS data • mzIdentML: Identification • mzQuantML: Quantitation • mzTab: Final results • TraML: SRM
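The format-to-content mapping on slide 10 can be sketched as a small lookup. This is only an illustration: the file extensions below are commonly used conventions and are an assumption, not something stated on the slide, and `classify_psi_file` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed extension-to-format table. The format names and content types come
# from the slide; the extensions are common conventions, not part of the slide.
PSI_FORMATS = {
    ".mzml": ("mzML", "MS data"),
    ".mzid": ("mzIdentML", "Identification"),
    ".mzq": ("mzQuantML", "Quantitation"),
    ".mztab": ("mzTab", "Final results"),
    ".traml": ("TraML", "SRM"),
}


def classify_psi_file(filename):
    """Return (format name, content type) for a file name, or None if unknown."""
    name = filename.lower()
    # Files are often gzip-compressed; look at the extension underneath.
    if name.endswith(".gz"):
        name = name[:-3]
    return PSI_FORMATS.get(Path(name).suffix)


if __name__ == "__main__":
    for example in ["run1.mzML", "search.mzid.gz", "transitions.TraML", "run1.raw"]:
        print(example, "->", classify_psi_file(example))
```

Vendor raw files fall outside the table on purpose: they are not PSI standard formats, so the helper returns `None` for them.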
  • 11. Overview • Why share MS proteomics data? • Types of information stored in MS proteomics repositories • Main existing repositories and their main characteristics • No data reprocessing • Data reprocessing • Other resources
  • 12. Proteomics repositories • Many different workflows need to be supported. They provide complementary ‘views’. • No data reprocessing. Data is stored as ‘published’ or originally analysed: • PRIDE Archive (focused on MS/MS data, all supported) • MassIVE (focused on MS/MS data) • jPOST (focused on MS/MS data) • PASSEL (only SRM data) • Data reprocessing (MS/MS data): • PeptideAtlas and GPMDB • proteomicsDB and HPM
  • 13. ProteomeXchange: A global, distributed proteomics database http://www.proteomexchange.org • Goal: Development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories. • Mandatory raw data deposition since July 2015 • [Figure: member repositories PASSEL (SRM data), PRIDE (MS/MS data), MassIVE (MS/MS data) and jPOST (MS/MS data, new in 2016); data types: Raw, ID/Q, Meta] Vizcaíno et al., Nat Biotechnol, 2014; Deutsch et al., NAR, 2017, in press
  • 14. Overview • Why share MS proteomics data? • Types of information stored in MS proteomics repositories • Main existing repositories and their main characteristics • No data reprocessing • Data reprocessing • Other resources
  • 15. Resources that don’t reprocess data • These resources try to represent the authors’ analysis view on the data. • Various workflows are allowed and they can provide complementary results. • Data are not ‘updated’ over time. However, meta-analysis on top is possible. • FDRs accumulate when datasets are combined. • Main representatives: PRIDE Archive and MassIVE (MS/MS data) and PeptideAtlas/PASSEL (SRM data). • Data standards are essential.
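The point about FDRs accumulating when datasets are combined (slide 15) can be made concrete with a deliberately simplified worst-case model. The assumptions are ours, not the slide's: every dataset reports the same number of identifications at the same FDR, the correct identifications are identical across datasets, and the false ones never overlap. `combined_fdr` is a hypothetical helper.

```python
def combined_fdr(n_datasets, ids_per_dataset, fdr_per_dataset):
    """FDR of the union of several datasets under a worst-case toy model.

    Assumptions (illustrative only): correct identifications are the same in
    every dataset, while false identifications are different in each one.
    """
    false_per_dataset = ids_per_dataset * fdr_per_dataset
    true_ids = ids_per_dataset - false_per_dataset   # shared, counted once
    false_ids = n_datasets * false_per_dataset       # distinct, so they add up
    return false_ids / (true_ids + false_ids)


if __name__ == "__main__":
    # 1000 identifications per dataset, each filtered at 1% FDR.
    for n in (1, 2, 5, 10):
        print(n, "datasets ->", round(combined_fdr(n, 1000, 0.01), 4))
```

With 10 such datasets the union holds 990 correct and 100 false identifications, so its FDR is 100/1090, roughly 9%, even though each dataset was filtered at 1%. Real datasets overlap only partially, so the actual growth is smaller, but the direction is the same.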
  • 16. Proteomics repositories • Many different workflows need to be supported. They provide complementary ‘views’. • No data reprocessing. Data is stored as ‘published’ or originally analysed: • PRIDE Archive (focused on MS/MS data, all supported) • MassIVE (focused on MS/MS data) • jPOST (focused on MS/MS data) • PASSEL (only SRM data) • Data reprocessing (MS/MS data): • PeptideAtlas and GPMDB • proteomicsDB and HPM
  • 17. PRIDE (PRoteomics IDEntifications) Archive http://www.ebi.ac.uk/pride/archive • PRIDE stores mass spectrometry (MS)-based proteomics data: • Peptide and protein expression data (identification and quantification) • Post-translational modifications • Mass spectra (raw data and peak lists) • Technical and biological metadata • Any other related information • Full support for tandem MS approaches • Any type of data can be stored. Martens et al., Proteomics, 2005; Vizcaíno et al., NAR, 2016
  • 18. MassIVE (UCSD) http://proteomics.ucsd.edu/service/massive/ • Data repository for MS proteomics data • Tools available for users to analyse their own data • Joined ProteomeXchange in June 2014
  • 19. MassIVE Interactivity http://massive.ucsd.edu http://proteomics.ucsd.edu • MassIVE = Mass spectrometry Interactive Virtual Environment
  • 20. MassIVE: Do it yourself http://massive.ucsd.edu http://proteomics.ucsd.edu 1. MSGF+ - Database search engine 2. MSPLIT - Spectral library search engine 3. ENOSI - Proteogenomic search engine 4. MODa - Multi-blind modification database search engine 5. Spectral Networks - Spectral alignment-based analysis and propagation of identifications 6. Multi-pass - MSPLIT, MSGFDB, MODa cascade search workflow 7. MSGFDB - Database search engine 8. MSPLIT-DIA - Spectral library search for SWATH 9. Upload your own! (mzIdentML, mzTab, TSV)
  • 21. jPOST Repository site (www.jpost.org) • Joined ProteomeXchange in July 2016
  • 22. PASSEL: repository for SRM data http://www.peptideatlas.org/passel/ • Suitable for SRM assays • Uses the PSI standard TraML plus the output of the most popular vendor pipelines • Started in 2012 • Part of the ProteomeXchange consortium Farrah et al., Proteomics, 2012
  • 23. Overview • Why share MS proteomics data? • Types of information stored in MS proteomics repositories • Main existing repositories and their main characteristics • No data reprocessing • Data reprocessing • Other resources
  • 24. Proteomics repositories • Many different workflows need to be supported. They provide complementary ‘views’. • No data reprocessing. Data is stored as ‘published’ or originally analysed: • PRIDE Archive (focused on MS/MS data, all supported) • MassIVE (focused on MS/MS data) • jPOST (focused on MS/MS data) • PASSEL (only SRM data) • Data reprocessing (MS/MS data): • PeptideAtlas and GPMDB • proteomicsDB and HPM
  • 25. Reprocessing repositories • These resources collect MS raw data and reprocess it using one given analysis pipeline and an up-to-date protein sequence database. • Advantage: They provide a ‘standardized’ and updated view on the experimental data available. • Disadvantage: Only one common analysis method is used and there can be information loss. • The result is different from the authors’ view on the data. • Main resources: GPMDB and PeptideAtlas (ISB, Seattle). • Novel resources: proteomicsDB and the Human Proteome Map.
  • 26. PeptideAtlas http://www.peptideatlas.org • Developed at the Institute for Systems Biology (ISB, Seattle, USA) • Peptide identifications from MS/MS approaches • Data are reprocessed using the popular Trans-Proteomic Pipeline (TPP) • Uses PeptideProphet to derive a probability of correct identification for every peptide it contains
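Per-identification probabilities of the kind mentioned on slide 26 lend themselves to a simple error estimate: if each accepted identification has probability p of being correct, the expected number of wrong ones is the sum of (1 - p). The sketch below illustrates that general idea only; it is not PeptideProphet's or PeptideAtlas's actual procedure, and `estimated_fdr` and the example probabilities are hypothetical.

```python
def estimated_fdr(probabilities, min_probability):
    """Estimate the FDR among identifications accepted at a probability cutoff.

    Each probability is taken as the chance that the identification is correct,
    so the expected number of false identifications is sum(1 - p).
    """
    accepted = [p for p in probabilities if p >= min_probability]
    if not accepted:
        return 0.0  # nothing accepted, so nothing can be false
    expected_false = sum(1.0 - p for p in accepted)
    return expected_false / len(accepted)


if __name__ == "__main__":
    # Made-up probabilities for four peptide identifications.
    probs = [0.99, 0.95, 0.90, 0.50]
    for cutoff in (0.5, 0.9, 0.99):
        print(cutoff, "->", round(estimated_fdr(probs, cutoff), 4))
```

Raising the cutoff trades sensitivity for a lower estimated error rate, which is the usual reason a resource publishes the probabilities rather than a fixed accept/reject call.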
  • 27. PeptideAtlas • All peptide IDs are mapped to Ensembl using ProteinProphet (to handle protein inference) • Provides proteotypic peptide predictions • Limited metadata available • Part of the HPP project Deutsch et al., Proteomics, 2005; Desiere et al., NAR, 2006; Deutsch et al., EMBO Rep, 2008
  • 28. PeptideAtlas builds • Builds are updated on a regular basis (usually once a year) • Examples of builds: Human (HPP context), Human plasma, Human urine, Drosophila, Mouse, Mouse plasma, Cow, Yeast, …
  • 29. GPMDB (Global Proteome Machine DB) http://gpmdb.thegpm.org/ • Originally developed by R. Beavis & R. Craig • End point of the GPM proteomics pipeline, to aid in the process of validating peptide MS/MS spectra and protein coverage patterns. Craig et al., J Proteome Res, 2004
  • 30. GPMDB • Data are reprocessed using the popular X!Tandem search engine or the X!Hunter spectral library searching algorithm • Also provides proteotypic peptides
  • 31. GPMDB • Nice visualization features • Provides very limited annotation with GO, BTO • Some support for targeted approaches is available • Part of the HPP consortium
  • 32. The Human Proteome Project (HPP) http://thehpp.org/
  • 33. HPP guidelines version 2.1
  • 34. Proteomics repositories • Many different workflows need to be supported. They provide complementary ‘views’. • No data reprocessing. Data is stored as ‘published’ or originally analysed: • PRIDE Archive (focused on MS/MS data, all supported) • MassIVE (focused on MS/MS data) • jPOST (focused on MS/MS data) • PASSEL (only SRM data) • Data reprocessing (MS/MS data): • PeptideAtlas and GPMDB • proteomicsDB and HPM
  • 35. Draft human proteome papers published in 2014 (Wilhelm et al., Nature, 2014; Kim et al., Nature, 2014; Nature cover, 29 May 2014) • Two independent groups claimed to have produced the first complete draft of the human proteome by MS. • Some of their findings are controversial and need further validation, but they generated a lot of discussion and put proteomics in the spotlight. • Two proteomics resources have been developed: proteomicsDB and the Human Proteome Map (HPM).
  • 36. ProteomicsDB https://www.proteomicsdb.org/ • Data analysis using Mascot and MaxQuant • The way the protein FDR is calculated is controversial • Quantification information using label-free techniques • New datasets are added on a regular basis
  • 37. ProteomicsDB (2)
  • 38. Human Proteome Map (HPM) http://www.humanproteomemap.org/ • Developed by the Pandey group. • Data reanalysis using Mascot. • Protein FDR is not mentioned at all in the corresponding Nature paper. • Static resource: it will not be updated any longer.
  • 39. Juan A. Vizcaíno juan@ebi.ac.uk WT Proteomics Bioinformatics Course 2016 Hinxton, 8 December 2016 • Why sharing MS proteomics data? • Types of information stored in MS proteomics repositories. • Main existing repositories and their main characteristics • No data reprocessing • Data reprocessing • Other resources Overview
  • 40. Juan A. Vizcaíno juan@ebi.ac.uk WT Proteomics Bioinformatics Course 2016 Hinxton, 8 December 2016 Chorus https://chorusproject.org/pages/ind ex.html • Developed by M. MacCoss’ group in Seattle (UW). • Built on top of Amazon Cloud technologies • Provides data analysis capabilities for the users • Free for public datasets. • The objective is to connect the data to analysis tools in a cloud environment
  • 41. Juan A. Vizcaíno juan@ebi.ac.uk WT Proteomics Bioinformatics Course 2016 Hinxton, 8 December 2016 MaxQB Human Proteinpedia Other repositories
  • 42. Juan A. Vizcaíno juan@ebi.ac.uk WT Proteomics Bioinformatics Course 2016 Hinxton, 8 December 2016 COPaKB Cardiac Organellar Protein Atlas Knowledgebase International collaboration (EMBL-EBI involved) Windows Client and iPad App Submit data for analysis in dta and mzML formats Data submitted to a ProLuCID pipeline No MS data download
  • 43. CPTAC data portal
  • 44. Pep2pro (Arabidopsis) http://fgcz-pep2pro.uzh.ch/ • Centered on Arabidopsis data • Download spectrum by spectrum • Quantitative information • Linked to gelmap.de (2DE)
  • 45. FINAL THOUGHTS
  • 46. Why are repositories not more popular? 1. Don’t want to share data • Researchers don’t like to be shown that they did not analyze the data as well as they could have. • Their FDR may be higher than they reported or believe. • Researchers are worried that they missed something in the data that they could discover if they go back to it at a later date. • They don’t want other authors to get a publication from their data. • However, this philosophy is changing gradually… (Slide from R. Chalkley)
  • 47. Why are repositories not more popular? (2) 2. Submission burden • Getting data into the correct format may require some work. • The author is not necessarily computer-savvy. • Having to also supply metadata is seen as a burden if the information is already present in an associated manuscript. • Associated raw data may be many GB in size; file transfer to the repository could take a while. Authors are impatient: they want to spend time doing science, not administration! (Slide from R. Chalkley)
  • 48. Conclusions • Importance of sharing MS proteomics data • The main existing proteomics repositories are complementary in focus and functionality. • Main characteristics of: • PeptideAtlas and GPMDB (reprocess data) • PASSEL, MassIVE, jPOST and PRIDE Archive (at present they do not reprocess data) • New resources: ProteomicsDB, HPM • Chorus, CPTAC portal,…
  • 49. Reproducible Science http://www.nature.com/nature/focus/reproducibility/
  • 50. Recommended reading • Perez-Riverol et al., Proteomics, 2015. PMID: 25158685
  • 51. Questions?