PRIDE Resource Team
PRIDE ProteoGenomics
Moving millions of Peptide Evidences into EBI Protein Resources.
SAB Meeting
EMBL-EBI, November 2018
Moving peptidoforms to ENSEMBL
• Increasing interest in viewing peptide MS/MS evidence in a genomic context, with a special focus on:
  • Post-translational modifications
  • Single amino acid variants
• Interest in expression information and its correlation with gene expression.
PRIDE Peptidome
[Pipeline diagram: PX complete submissions (1 ... n) → PRIDE Archive import → mgf (annotated spectra) → Hadoop cluster → clustering files]
QC:
• PX successfully converted
• New peptides/PTMs
• Number of identified and non-identified spectra
QC:
• Number of new clusters
• PRIDE Cluster score distribution
• Number of clusters by modification
Johannes Griss (Visitor Postdoc)
Griss J and Perez-Riverol Y, et al. Nature Methods, 2016
http://wwwdev.ebi.ac.uk/pride/peptidome
VLIPVFALGR 0.98
ITVLEALR 0.95
LFDWANTSR 0.89
ATLNAFLYR 1.00
LAQFDYGR 0.75
...
ENSEMBL Track Hub Registry
Trackhub search
Search results for the keyword “proteome”
Filter by species (organisms) / assembly version
Attached hub (Ensembl)
Mapping peptides to ENSEMBL
GitHub Tool: https://github.com/bigbio/pgatk/tree/master/PepGenome
For each .pogo file:
• PTMs are standardized to a common representation using the PRIDE-Mod library.
• Each peptide references an Assay URL in PRIDE.
• Each PoGo file is generated automatically by the PRIDE pipeline.
Columns (BED format): chromosome, start, end, feature (peptide), score, strand, thickStart, thickEnd, itemRgb, blockCount, blockSizes, blockStarts
chr1 1314335 1314365 VLIPVFALGR 1000 - 1314335 1314335 0,0,0 1 30 0
chr1 1454464 1454488 ITVLEALR 1000 + 1454464 1454464 128,128,128 1 24 0
chr1 1456317 1456344 LFDWANTSR 1000 + 1456317 1456317 128,128,128 1 27 0
chr1 1459184 1459211 ATLNAFLYR 1000 + 1459184 1459184 128,128,128 1 27 0
chr1 1462609 1462633 LAQFDYGR 1000 + 1462609 1462609 128,128,128 1 24 0
chr1 1485135 1485159 ITVLEALR 1000 + 1485135 1485135 128,128,128 1 24 0
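The BED-style rows above can be read with a few lines of Python. This is a minimal sketch, not part of the PRIDE pipeline: the `PeptideFeature` class and both function names are hypothetical, and it assumes whitespace-separated 12-column rows as shown on the slide. The sanity check relies on the fact that each amino acid is encoded by three nucleotides, so the block sizes of a fully mapped peptide should sum to three times its length.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeptideFeature:
    """One mapped peptide, taken from a 12-column BED row (hypothetical helper)."""
    chrom: str
    start: int
    end: int
    peptide: str
    score: int
    strand: str
    item_rgb: str
    block_sizes: List[int]


def parse_bed_line(line: str) -> PeptideFeature:
    """Parse one whitespace-separated BED12 row into a PeptideFeature."""
    fields = line.split()
    if len(fields) != 12:
        raise ValueError(f"expected 12 BED columns, got {len(fields)}")
    return PeptideFeature(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        peptide=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        item_rgb=fields[8],
        # blockSizes is a comma-separated list and may end with a comma
        block_sizes=[int(size) for size in fields[10].split(",") if size],
    )


def covers_whole_peptide(feature: PeptideFeature) -> bool:
    """True if the mapped blocks add up to 3 nucleotides per amino acid."""
    return sum(feature.block_sizes) == 3 * len(feature.peptide)


if __name__ == "__main__":
    row = "chr1 1314335 1314365 VLIPVFALGR 1000 - 1314335 1314335 0,0,0 1 30 0"
    feature = parse_bed_line(row)
    print(feature.peptide, feature.strand, covers_whole_peptide(feature))
```

For the rows on this slide every peptide maps to a single block, so the check reduces to comparing `end - start` with three times the peptide length.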
PRIDE Peptidome Pipeline
[Pipeline diagram: PX complete submissions (1 ... n) → PRIDE Archive import → mgf (annotated spectra) → Hadoop cluster → clustering files → peptide export → peptide tables (PoGo file) → tracks (by taxonomy) → TrackHub generation (by taxonomy) → TrackHub Registry]
QC:
• PX successfully converted
• New peptides/PTMs
• Number of identified and non-identified spectra
QC:
• Number of new clusters
• PRIDE Cluster score distribution
• Number of clusters by modification
Johannes Griss (Visitor Postdoc)
• Automatic update with each new ENSEMBL release.
• Supports more than 19 species from ENSEMBL.
• Update with each new release of the data in PRIDE Peptidome.
• Supports highlighting of PTMs and of transcript and gene uniqueness.
Annotating peptide evidences from PRIDE Archive projects
[Workflow diagram; only some step numbers are recoverable from the original figure]
1. PRIDE Submission Pipelines: PX complete submission assays (mzid, peak lists).
2. Convert to mztab/mgf and filter out evidences that do not pass the reported mzid threshold.
5. Storage of the project metadata, peptide sequences and protein identifiers in Solr and MongoDB.
Other steps shown in the diagram:
• Peptide filtering: 1% FDR at PSM level (combined results); 1% FDR at peptide level (combined results). Filters (HPP): > 8 AA; 1% FDR at transcript level (inference needed).
• Export per assay: peptides → PoGo file.
• Tracks (by taxonomy) → TrackHub generation (by taxonomy) → TrackHub Registry.
• The TrackHub Registry can search tracks by ShortLabel, LongLabel and OmicsType (Proteomics, Genomics, Transcriptomics).
http://ftp.pride.ebi.ac.uk/pride/data/proteogenomics/latest/archive/
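The TrackHub generation step can be illustrated with a small sketch that writes one track stanza in the UCSC track hub `trackDb.txt` format, which is where the short and long labels searched by the TrackHub Registry come from. This is an assumption-laden illustration, not the PRIDE generator: the function name, track name, URL and labels are hypothetical, and it assumes the BED output has already been converted to bigBed, which track hubs require.

```python
def trackdb_stanza(track: str, big_data_url: str,
                   short_label: str, long_label: str,
                   bed_fields: int = 12) -> str:
    """Build one trackDb.txt stanza for a bigBed peptide track (hypothetical helper)."""
    lines = [
        f"track {track}",
        f"bigDataUrl {big_data_url}",
        f"shortLabel {short_label}",
        f"longLabel {long_label}",
        f"type bigBed {bed_fields}",
        "itemRgb on",        # use the colour stored in the itemRgb column
        "visibility dense",
    ]
    # Stanzas in trackDb.txt are separated by blank lines, so end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    print(trackdb_stanza(
        track="pride_example_peptides",
        big_data_url="https://example.org/hub/hg38/example_peptides.bb",
        short_label="PRIDE peptides",
        long_label="Example peptide evidences mapped to genome coordinates",
    ))
```

Keeping label construction in one function makes it easier to apply a consistent naming scheme across species and assemblies.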
Generating reliable peptide tables
Current filter options:
• 1% FDR at PSM level (combined results)
• 1% FDR at peptide level (combined results)
Possible filters (HPP):
• > 8 AA
• 1% FDR at transcript level (inference needed)
Combined PSM score:
- Same spectrum and peptide
- Different search engines
Combined peptide score:
- Same peptide
- Different PSMs
Experiment Peptide PSMs Quant
<a href="http://www.ebi.ac.uk/pride/archive/assays/34642">Assay 34642</a> APPLLEGAPFR 1 1.000000
<a href="http://www.ebi.ac.uk/pride/archive/assays/34644">Assay 34644</a> APPLLEGAPFR 1 1.000000
<a href="http://www.ebi.ac.uk/pride/archive/assays/34642">Assay 34642</a> THTQDAVPLTLGQEFSGYVQQVQYAM(oxidation)VR 1 1.000000
<a href="http://www.ebi.ac.uk/pride/archive/assays/34645">Assay 34645</a> KKQVM(oxidation)EK 1 1.000000
<a href="http://www.ebi.ac.uk/pride/archive/assays/34645">Assay 34645</a> VGSGDTNNFPYLEK 2 2.000000
<a href="http://www.ebi.ac.uk/pride/archive/assays/34645">Assay 34645</a> SLTYLSILR 3 3.000000
<a href="http://www.ebi.ac.uk/pride/archive/assays/34642">Assay 34642</a> LPFTPLSYIQGLSHR 8 8.000000
Audain, Enrique, et al. Journal of Proteomics, 2017
[Figure: network diagram with nodes labelled PS1 to PS8, P1 to P7 and PR1 to PR10]
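The peptide table rows on this slide (Experiment, Peptide, PSMs, Quant) can be parsed and checked against the proposed HPP length filter with a short sketch. The function names are hypothetical, and two points are assumptions: that modifications are always written in parentheses after the modified residue, as in `M(oxidation)`, and that "> 8 AA" means strictly more than eight residues in the unmodified sequence.

```python
import re
from typing import Dict, Union

Row = Dict[str, Union[str, int, float, None]]


def parse_peptide_row(line: str) -> Row:
    """Split one table row into experiment, peptide, PSM count and quant value."""
    # The experiment column contains spaces (an HTML link), so split from the right.
    experiment, peptide, psms, quant = line.rsplit(maxsplit=3)
    assay = re.search(r"/assays/(\d+)", experiment)
    return {
        "assay_id": assay.group(1) if assay else None,
        "peptide": peptide,
        # Drop "(modification)" annotations to get the bare amino acid sequence.
        "sequence": re.sub(r"\([^)]*\)", "", peptide),
        "psms": int(psms),
        "quant": float(quant),
    }


def passes_length_filter(row: Row, min_length: int = 8) -> bool:
    """Assumed reading of the '> 8 AA' filter: strictly longer than min_length."""
    return len(str(row["sequence"])) > min_length


if __name__ == "__main__":
    line = ('<a href="http://www.ebi.ac.uk/pride/archive/assays/34645">'
            'Assay 34645</a> KKQVM(oxidation)EK 1 1.000000')
    row = parse_peptide_row(line)
    print(row["assay_id"], row["sequence"], passes_length_filter(row))
```

Under these assumptions the short oxidized peptide in the table would be dropped by the length filter, while the longer peptides would be kept.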
Complete Submissions (Human)
PSMs = 4,374,055
MOD-PSMs = 1,225,565
http://ftp.pride.ebi.ac.uk/pride/data/proteogenomics/latest/
Human (hg38), Mouse (mm10)
• Black (all identified peptides)
• Cyan (oxidation)
• Orange (acetyl)
• Red (phospho)
• 182 PRIDE public datasets.
• 163 from Homo sapiens.
• 15 from Mus musculus.
• 4 from Rattus norvegicus and 2 from Bos taurus.
• 4 million peptidoforms, including PTMs.
http://ftp.pride.ebi.ac.uk/pride/data/proteogenomics/latest/
ENSEMBL TrackHub Visualization
(Also mapping data from other ProteomeXchange partners)
We have mapped more than 1 million peptides from PeptideAtlas to ENSEMBL genome coordinates.
Conclusions
• Increase the number of submissions mapped to ENSEMBL coordinates.
• Explore the possibility of linking each peptide evidence to the corresponding spectrum visualizer in PRIDE.
• Provide more information about disease, tissue and cell type as this information improves in PRIDE.
• Develop pipelines to move intensity-based quantitative data into ENSEMBL.
• Reuse the generated data to improve ENSEMBL annotations.
PRIDE Developer Team
@pride_ebi
@proteomexchange
Manuel Bernal-Llinares (track-hub creator)
Johannes Griss (PRIDE Cluster pipelines)
Christoph Schlaffner (PoGo tool)
Jyoti Choudhary (PI)
Alessandro Vullo (TrackHub Registry)
ENSEMBL Team
Sanger Team

Mapping millions of peptidoforms to Genome Coordinates


Editor's Notes

  1. Try to merge KNIME information into slide 7.