Dr. Yasset Perez-Riverol
Twitter/github: @ypriverol
Proteomics Project Leader
EMBL-EBI
Hinxton, Cambridge, UK
Proteogenomics
Integration of proteomics data in Ensembl.
Proteomics Bioinformatics Course
EMBL-EBI, July – 2019
Outline
I. Why multi-omics approaches are so important.
II. What is proteogenomics.
III. Proteogenomics by integrating proteomics and
genomics resources.
IV. Ensembl/UCSC Trackhubs.
V. Integration of Proteomics data into Ensembl
trackhubs.
Ritchie, Marylyn D., et al. "Methods of integrating data to uncover genotype–phenotype interactions." Nature
Reviews Genetics 16.2 (2015): 85.
What is proteogenomics
Proteogenomics is a field of biological research that utilizes a combination of proteomics,
genomics, and transcriptomics to aid in the discovery and identification/quantification of
peptides and proteins. Proteogenomics is used to identify new peptides by comparing
MS/MS spectra against a protein database that has been derived from genomic and
transcriptomics information.
Gene annotations: identification of novel peptides. In prokaryotic organisms: frameshifts, N-terminal methionine excision, signal peptides, and other post-translational modifications.
Multi-omics analysis: correlation between genomics and proteomics sequence events (genetic mutations, post-translational modifications).
Proteogenomics:
• Custom protein sequence database: six-frame / three-frame translation, RNA-seq data, exome data; FDR analysis and filtering rules; downstream analysis.
• Mapping to a reference system: high-quality evidence; FDR analysis and filtering rules; genome coordinates and trackhubs; downstream analysis (correlations between features).
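Six-frame translation, listed above as one source of custom protein sequences, can be sketched in a few lines. This is a minimal illustration using only the Python standard library and the standard genetic code; the function names and the toy DNA sequence are assumptions made for this example and do not come from any of the tools named in the talk.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translate(dna):
    """Translate the three forward and three reverse reading frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Toy sequence (hypothetical): ATG GCC TAA
frames = six_frame_translate("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

A real database builder would additionally split each frame at stop codons and discard very short open reading frames; that step is left out here to keep the sketch short.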
Ruggles, Kelly V., et al. "Methods, tools and current
perspectives in proteogenomics." Molecular & Cellular
Proteomics (2017): mcp-000024.
Tool                      URL
customProDB               https://www.bioconductor.org/packages/release/bioc/html/customProDB.html
Galaxy Integrated Omics   https://bessantlab.org/software/gio/
Galaxy-P                  https://toolshed.g2.bx.psu.edu/view/galaxyp/proteomics_rnaseq_sap_db_workflow/3a11830963e3
QUILTS                    http://quilts.fenyolab.org
MutationDB                http://proteomics.ucsd.edu/software-tools/cancerproteogenomics-4/
SpliceDB                  http://proteomics.ucsd.edu/software-tools/cancerproteogenomics-4/
PGA                       http://bioconductor.org/packages/PGA/
MSProGene                 http://sourceforge.net/projects/msprogene/
PPLine                    https://sourceforge.net/projects/ppline/
Pypgatk (Python proteogenomics analysis toolkit)   https://pgatk.readthedocs.io/en/latest/pypgatk.html
90% of peptides
Most of these peptides are localized within an exon, and the remaining peptides, typically less than twenty percent, span an exon-exon junction.
PGATK: ProteoGenomics Analysis Toolkit
https://pgatk.readthedocs.io/en/latest/pypgatk.html
Extend database for cancer-oriented studies
All MS experiments, with filters: Taxonomy, Tissues
COSMIC cancer mutations
cBioPortal cancer mutations
Experimental Design
• VCF variants from Ensembl, including germline variations
• VCF variants from Ensembl, including somatic mutations (filter by tissues)
• Cancer mutation protein databases based on COSMIC and cBioPortal: somatic mutations (filter by tissues)
Database Construction
Mutation VCF files -> Protein databases
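The core idea of turning a variant into a protein entry can be sketched as: apply the alternate allele to the coding sequence, then translate. The example below is a toy under strong assumptions: the variant position is already expressed as a 1-based offset into the coding sequence, the transcript is on the forward strand, and the sequence is invented. The function names are hypothetical and this is not how pypgatk itself is implemented.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(
        product("TCAG", repeat=3),
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    )
}


def translate(cds):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )


def apply_variant(cds, cds_pos, ref, alt):
    """Replace `ref` with `alt` at a 1-based coding-sequence position."""
    start = cds_pos - 1
    if cds[start:start + len(ref)] != ref:
        raise ValueError("reference allele does not match the coding sequence")
    return cds[:start] + alt + cds[start + len(ref):]


toy_cds = "ATGGATTTCTAA"  # hypothetical coding sequence: M D F *

# Missense: GAT -> GTT changes D to V.
missense = translate(apply_variant(toy_cds, 5, "A", "T"))

# Frameshift: deleting one base shifts every downstream codon.
frameshift = translate(apply_variant(toy_cds, 4, "GA", "G"))

print(translate(toy_cds), missense, frameshift)
```

A production pipeline also has to map genomic VCF coordinates onto each transcript, handle the reverse strand, and decide how to treat stop codons introduced by the variant.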
ProteinDB output and Decoy Generation
>var_rs78350717_22.15528913.C.A_ENST00000252835 protein_coding:missense_variant
MCPLTLQVTGLMNVSEPNSSFAFVNEFILQGFSCEWTIQIFLFSLFTTTYALTITGNGAIAFVLWC
DRRLHTPMYMFLGNFSFLEIWYVSSTVPKMLVNFLSEKKNISFAGCFLQFYFFFSLGTSECLLLT
VMAFDQYLAICRPLLYPNIMTGHLYAKLVILCWVCGFLWFLIPIVLISQMPFCGPNIIDHVVCDPGP
RFALDCVSAPRIQLFCYTLSSLVIFGNFLFIIGSYTLVLKAMLGMPSSTGRHKDFSTCGSHLAVVS
LCYSSLMVMYVSPGLGHSTGMQKIETLFYAMVTPLFNPLIYSLQNKEIKAALRKVLGSSNII*
>var_rs78350717_22.15528913.C.T_ENST00000252835 protein_coding:missense_variant
MCPLTLQVTGLMNVSEPNSSFAFVNEFILQGFSCEWTIQIFLFSLFTTTYALTITGNGAIAFVLWC
DRRLHTPMYMFLGNFSFLEIWYVSSTVPKMLVNFLSEKKNISFAGCFLQFYFFFSLGTSECLLLT
VMAFDQYLAICRPLLYPNIMTGHLYAKLVILCWVCGFLWFLIPIVLISQMPFCGPNIIDHVVCDPGP
RFALDCVSAPRIQLFCYTLSSLVIFGNFLFIIGSYTLVLKAMLGMPSSTGRHKVFSTCGSHLAVVS
LCYSSLMVMYVSPGLGHSTGMQKIETLFYAMVTPLFNPLIYSLQNKEIKAALRKVLGSSNII*
>var_rs78350717_22.15528913.C.A_ENST00000643195 protein_coding:missense_variant
MNVSEPNSSFAFVNEFILQGFSCEWTIQIFLFSLFTTTYALTITGNGAIAFVLWCDRRLHTPMYMF
LGNFSFLEIWYVSSTVPKMLVNFLSEKKNISFAGCFLQFYFFFSLGTSECLLLTVMAFDQYLAICR
PLLYPNIMTGHLYAKLVILCWVCGFLWFLIPIVLISQMPFCGPNIIDHVVCDPGPRFALDCVSAPRI
QLFCYTLSSLVIFGNFLFIIGSYTLVLKAMLGMPSSTGRHKDFSTCGSHLAVVSLCYSSLMVMYV
SPGLGHSTGMQKIETLFYAMVTPLFNPLIYSLQNKEIKAALRKVLGSSNII*
>var_rs78350717_22.15528913.C.T_ENST00000643195 protein_coding:missense_variant
MNVSEPNSSFAFVNEFILQGFSCEWTIQIFLFSLFTTTYALTITGNGAIAFVLWCDRRLHTPMYMF
LGNFSFLEIWYVSSTVPKMLVNFLSEKKNISFAGCFLQFYFFFSLGTSECLLLTVMAFDQYLAICR
PLLYPNIMTGHLYAKLVILCWVCGFLWFLIPIVLISQMPFCGPNIIDHVVCDPGPRFALDCVSAPRI
QLFCYTLSSLVIFGNFLFIIGSYTLVLKAMLGMPSSTGRHKVFSTCGSHLAVVSLCYSSLMVMYV
SPGLGHSTGMQKIETLFYAMVTPLFNPLIYSLQNKEIKAALRKVLGSSNII*
>var_rs147461488_22.17740102.GGCCACGCTCAACT.G_ENST00000342111 protein_coding:frameshift_variant
NGGGRERGS*AWPSWSCDSDSESQEDIIRNIARHLAQVGDSMDRSIPPGLVNGLALQLRNTSR
SEEDRNRDLATALEQLLQAYPRDMEKEKTMLVWPRRWPVTRRPCSVMSFTQQ*ILLTRTYAPT*
GA*PEM
>var_rs147461488_22.17740102.GGCCACGCTCAACT.G_ENST00000617586 protein_coding:frameshift_variant
EGGSARGAIRRKRVVDRVRAPGRRCLGPDAPAPPRLEGGRHWDTVNQE*VGAAALPRPWTVR
STTVPASGMSASQTYWCLASSKAVLTTASAESCRAASAGSPVGGLR*AAD*WQPQQ
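The headers in this output pack the variant identifier, location, alleles, transcript and consequence into one line. A small parser makes that structure explicit. The pattern below is inferred only from the entries shown on this slide, with each header on a single line; it is an assumption, not a documented specification of the pypgatk output, and the names `VariantHeader` and `parse_header` are hypothetical.

```python
import re
from typing import NamedTuple


class VariantHeader(NamedTuple):
    variant_id: str
    chrom: str
    pos: int
    ref: str
    alt: str
    transcript: str
    biotype: str
    consequence: str


# Layout assumed from the slide:
# >var_<id>_<chrom>.<pos>.<ref>.<alt>_<transcript> <biotype>:<consequence>
HEADER_RE = re.compile(
    r"^>var_(?P<variant_id>[^_]+)_(?P<chrom>[^.]+)\.(?P<pos>\d+)"
    r"\.(?P<ref>[ACGTN]+)\.(?P<alt>[ACGTN]+)_(?P<transcript>\S+)"
    r"\s+(?P<biotype>[^:\s]+):(?P<consequence>\S+)$"
)


def parse_header(line):
    """Parse one FASTA header line into its fields."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised header: {line!r}")
    fields = match.groupdict()
    fields["pos"] = int(fields["pos"])
    return VariantHeader(**fields)


header = parse_header(
    ">var_rs78350717_22.15528913.C.A_ENST00000252835 "
    "protein_coding:missense_variant"
)
print(header)
```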
https://pgatk.readthedocs.io/en/latest/pypgatk.html
python pypgatk_cli.py generate-decoy -c config/protein_decoy.yaml --input proteindb_from_lincRNA_canonical_sequences.fa --output decoy_proteindb.fa
Wright, James C., and Jyoti S. Choudhary. "DecoyPyrat: fast non-redundant hybrid decoy
sequence generation for large scale proteomics." Journal of proteomics & bioinformatics
9.6 (2016): 176.
Reanalyzing the data
Zhu, Yafeng, et al. Nature communications 9.1 (2018): 903.
Integration of genomics / proteomics
resources
The number of multi-omics datasets is growing, but not fast enough. An alternative
option is to correlate and complement your proteomics data with existing public
genomics data or with other proteomics datasets in the PRIDE database.
Multi-omics analysis: correlation between genomics and proteomics sequence events (genetic mutations, post-translational modifications).
Mapping to a reference system: high-quality evidence; FDR analysis and filtering rules; genome coordinates and trackhubs; downstream analysis (correlations between features).
Features: gene features, transcript features, protein features, protein family / structures, ...
PRIDE datasets to Ensembl coordinates
Components: PX Submission Tool, PRIDE Archive (1), PRIDE submission pipelines (2), PRIDE Archive web and API (3), TrackHub Registry (4).
1. A PX submission can be Partial or Complete:
   Partial Submission: RAW data, search results and peak lists.
   Complete Submission: RAW data, result files and peak lists, search results.
2. PRIDE submission pipelines: for each PX Complete Submission, the assay files (mzid, peak lists) are converted to mztab and mgf.
3. Each PX submission can be searched by: title, metadata, description, tissue, taxonomy, PTMs, peptide sequence or protein identifier.
4. The TrackHub Registry can search tracks by: shortLabel, longLabel, and OmicsType (Proteomics, Genomics, Transcriptomics).
5. Convert to mztab/mgf and filter out evidence that does not pass the reported mzid threshold.
6. Store the project metadata, peptide sequences and protein identifiers in Solr and MongoDB.
TrackHub generation: PX Complete Submission, assay, peptide, PoGo file, taxonomies, track, TrackHub Registry.
Generating reliable peptide tables
Current filter options:
• 1% FDR at PSM level (combined results)
• 1% FDR at peptide level (combined results)
Possible filters (HPP):
• > 8 AA
• 1% FDR at transcript level (inference needed)
Combine PSM score:
- Same spectra, peptide
- Different search engine
Combine peptide score:
- Same peptide
- Different PSMs
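A 1% FDR filter at PSM level can be sketched with the common target-decoy approach. The slide does not say how FDR is estimated in the PRIDE pipeline, so the estimator used here (decoys divided by targets, converted to q-values) is an assumption, as are the `PSM` class, the "higher score is better" convention, and the toy scores.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    peptide: str
    score: float     # assumption: higher means a better match
    is_decoy: bool


def filter_by_fdr(psms, threshold=0.01):
    """Keep target PSMs whose q-value is at or below `threshold`."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)

    # Running FDR estimate at each rank: decoys seen / targets seen.
    fdrs = []
    targets = decoys = 0
    for psm in ranked:
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or at any lower-scoring rank.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [
        psm for psm, q in zip(ranked, qvalues)
        if q <= threshold and not psm.is_decoy
    ]


# Toy data (hypothetical scores).
psms = [
    PSM("PEPTIDEA", 10.0, False),
    PSM("PEPTIDEB", 9.0, False),
    PSM("PEPTIDEC", 8.0, False),
    PSM("DECOYSEQ", 7.0, True),
    PSM("PEPTIDEE", 6.0, False),
]
accepted = filter_by_fdr(psms, threshold=0.01)
print([p.peptide for p in accepted])
```

The same function can be reused at peptide level by first collapsing PSMs to one best-scoring entry per peptide.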
Experiment    Peptide                                  PSMs  Quant
Assay 34642   APPLLEGAPFR                              1     1.000000
Assay 34644   APPLLEGAPFR                              1     1.000000
Assay 34642   THTQDAVPLTLGQEFSGYVQQVQYAM(oxidation)VR  1     1.000000
Assay 34645   KKQVM(oxidation)EK                       1     1.000000
Assay 34645   VGSGDTNNFPYLEK                           2     2.000000
Assay 34645   SLTYLSILR                                3     3.000000
Assay 34642   LPFTPLSYIQGLSHR                          8     8.000000
Each experiment links to its PRIDE assay page, for example http://www.ebi.ac.uk/pride/archive/assays/34642
[Figure: Protein Inference Toolkit, protein groups]
Audain, Enrique, et al. "In-depth analysis of protein inference algorithms using multiple search engines and well-
defined metrics." Journal of proteomics 150 (2017): 170-182.
Mapping peptides to ENSEMBL
https://pgatk.readthedocs.io/en/latest/pepgenome.html
For each .pogo file:
• PTMs are standardized to a common representation using the PRIDE-Mod library.
• Each peptide references an assay URL in PRIDE.
• Each PoGo file is generated automatically by the PRIDE pipeline.
chr1 1314335 1314365 VLIPVFALGR 1000 - 1314335 1314335 0,0,0 1 30 0
chr1 1454464 1454488 ITVLEALR 1000 + 1454464 1454464 128,128,128 1 24 0
chr1 1456317 1456344 LFDWANTSR 1000 + 1456317 1456317 128,128,128 1 27 0
chr1 1459184 1459211 ATLNAFLYR 1000 + 1459184 1459184 128,128,128 1 27 0
chr1 1462609 1462633 LAQFDYGR 1000 + 1462609 1462609 128,128,128 1 24 0
chr1 1485135 1485159 ITVLEALR 1000 + 1485135 1485135 128,128,128 1 24 0
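The rows above have twelve whitespace-separated fields, which matches the layout of a standard 12-column BED record. A minimal parser, assuming that layout, could look like the sketch below; the field names follow the UCSC BED description, while `BedFeature` and `parse_bed12` are hypothetical names chosen for this example.

```python
from typing import NamedTuple, Tuple


class BedFeature(NamedTuple):
    chrom: str
    start: int                    # 0-based start (BED convention)
    end: int                      # end position, exclusive
    name: str                     # the peptide sequence in these rows
    score: int
    strand: str
    thick_start: int
    thick_end: int
    item_rgb: Tuple[int, ...]
    block_count: int
    block_sizes: Tuple[int, ...]
    block_starts: Tuple[int, ...]


def _int_list(field):
    """Parse a comma-separated integer list, tolerating a trailing comma."""
    return tuple(int(value) for value in field.split(",") if value)


def parse_bed12(line):
    """Parse one 12-column BED line into a BedFeature."""
    fields = line.split()
    if len(fields) != 12:
        raise ValueError(f"expected 12 BED columns, got {len(fields)}")
    return BedFeature(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        name=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        thick_start=int(fields[6]),
        thick_end=int(fields[7]),
        item_rgb=_int_list(fields[8]),
        block_count=int(fields[9]),
        block_sizes=_int_list(fields[10]),
        block_starts=_int_list(fields[11]),
    )


row = "chr1 1314335 1314365 VLIPVFALGR 1000 - 1314335 1314335 0,0,0 1 30 0"
feature = parse_bed12(row)
print(feature.name, feature.end - feature.start)
```

For a peptide that maps within a single exon, the genomic span equals three nucleotides per residue, which is a cheap sanity check on parsed rows.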
Future challenges:
• BED information can be extended with more information about transcript reliability:
  • Peptide uniqueness.
  • Reliability score.
• Native bigBed should be provided, to remove the need to customize new pipelines, etc.
• What to do with the unmapped peptides (which are long lists).
• Maintainability.
BED columns labelled in the figure: chromosome, start, end, feature, score, strand, itemRgb
Complete Submissions (Human)
PSMs = 4,374,055
MOD-PSMs = 1,225,565
http://ftp.pride.ebi.ac.uk/pride/data/proteogenomics/latest/
Complete Submissions (Mouse)
PTMs distribution per chromosome
Human (hg38) Mouse (mm10)
• Black (all identified peptides)
• Cyan (oxidation)
• Orange (acetyl)
• Red (phospho)
https://github.com/bigbio/PepBed
Oxidation Phosphorylation
PTM occurrence (oxidation/phospho) on ~20,000 human protein-coding genes
Introduction to Genome Browsers
• Browse genes in their genomic context.
• See features in and around a specific gene.
• Investigate genome organization and explore larger chromosome regions.
Ensembl Genome Browser: http://www.ensembl.org/
UCSC Genome Browser: http://genome.ucsc.edu/
ENSEMBL Browser
https://www.ebi.ac.uk/training/online/course/ensembl-browser-webinar-series-2016/introduction-ensembl
What are Track hubs?
• Internet-accessible collections of genome annotations.
• Transfer on demand – hub annotations are stored at the remote site.
• Client-side caching.
Trackhub structure
• hub.txt – defines the labels used to describe the hub
• genomes.txt – describes the assemblies supported by the hub
• trackDb.txt – describes the data files and defines their display attributes
  • complex format, a collection of stanzas
  • defines the display and configuration properties
Track database definition document:
http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html
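The three files above are plain text made of "key value" lines grouped into stanzas, so a hub skeleton can be generated with a few string operations. The sketch below uses setting names from the UCSC track hub documentation (`hub`, `shortLabel`, `longLabel`, `genomesFile`, `email`, `genome`, `trackDb`, `track`, `bigDataUrl`, `type`); the hub name, e-mail address, URL and function names are placeholders invented for illustration, and only a minimal subset of settings is shown.

```python
def render_stanza(settings):
    """Render one stanza: one 'key value' pair per line."""
    return "".join(f"{key} {value}\n" for key, value in settings.items())


def build_track_hub(hub_name, email, assembly, tracks):
    """Return a mapping of relative file path to file content."""
    hub_txt = render_stanza({
        "hub": hub_name,
        "shortLabel": hub_name,
        "longLabel": f"{hub_name} peptide evidence",
        "genomesFile": "genomes.txt",
        "email": email,
    })
    genomes_txt = render_stanza({
        "genome": assembly,
        "trackDb": f"{assembly}/trackDb.txt",
    })
    # Stanzas in trackDb.txt are separated by a blank line.
    track_db_txt = "\n".join(
        render_stanza({
            "track": track["name"],
            "bigDataUrl": track["url"],
            "shortLabel": track["name"],
            "longLabel": track["description"],
            "type": "bigBed 12",
        })
        for track in tracks
    )
    return {
        "hub.txt": hub_txt,
        "genomes.txt": genomes_txt,
        f"{assembly}/trackDb.txt": track_db_txt,
    }


# Placeholder values, not a real hub.
files = build_track_hub(
    hub_name="example_hub",
    email="user@example.org",
    assembly="hg38",
    tracks=[{
        "name": "peptides",
        "url": "https://example.org/hub/hg38/peptides.bb",
        "description": "Example peptide evidence track",
    }],
)
for path, content in files.items():
    print(f"--- {path}\n{content}")
```

Writing each entry of `files` to disk under a web-accessible directory gives the layout a genome browser expects when the hub URL points at hub.txt.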
Track Hubs in Ensembl
Attaching a hub (Ensembl)
Attached hub (Ensembl)
Attached hub (Ensembl)
Developed by: Alessandro Vullo
Trackhub search
Search results for the keyword "proteome"
Filter by species (organisms) / assembly version
Trackhub detailed view
PRIDE team
Johannes Griss (PRIDE Cluster pipelines)
Boston Children's Hospital
Christoph Schlaffner (PepGenome tool)
Juan A. Vizcaino (PI)
Alessandro Vullo (trackhub registry)
ENSEMBL Team
Chakradhar Reddy Bandla (PepGenome tool)
Karolinska Institute
Husen Umer (PyPGATK tool)
Rui Branca (PyPGATK tool)
Yafeng Zhu (PyPGATK tool)
Β 
hyderabad call girl.pdfRussian Call Girls in Hyderabad Amrita 9907093804 Inde...
hyderabad call girl.pdfRussian Call Girls in Hyderabad Amrita 9907093804 Inde...hyderabad call girl.pdfRussian Call Girls in Hyderabad Amrita 9907093804 Inde...
hyderabad call girl.pdfRussian Call Girls in Hyderabad Amrita 9907093804 Inde...
Β 
Call Girls in Lucknow Esha πŸ” 8923113531 πŸ” 🎢 Independent Escort Service Lucknow
Call Girls in Lucknow Esha πŸ” 8923113531  πŸ” 🎢 Independent Escort Service LucknowCall Girls in Lucknow Esha πŸ” 8923113531  πŸ” 🎢 Independent Escort Service Lucknow
Call Girls in Lucknow Esha πŸ” 8923113531 πŸ” 🎢 Independent Escort Service Lucknow
Β 
Call Girl Guwahati Aashi πŸ‘‰ 7001305949 πŸ‘ˆ πŸ” Independent Escort Service Guwahati
Call Girl Guwahati Aashi πŸ‘‰ 7001305949 πŸ‘ˆ πŸ” Independent Escort Service GuwahatiCall Girl Guwahati Aashi πŸ‘‰ 7001305949 πŸ‘ˆ πŸ” Independent Escort Service Guwahati
Call Girl Guwahati Aashi πŸ‘‰ 7001305949 πŸ‘ˆ πŸ” Independent Escort Service Guwahati
Β 
Call Girls Guwahati Aaradhya πŸ‘‰ 7001305949πŸ‘ˆ 🎢 Independent Escort Service Guwahati
Call Girls Guwahati Aaradhya πŸ‘‰ 7001305949πŸ‘ˆ 🎢 Independent Escort Service GuwahatiCall Girls Guwahati Aaradhya πŸ‘‰ 7001305949πŸ‘ˆ 🎢 Independent Escort Service Guwahati
Call Girls Guwahati Aaradhya πŸ‘‰ 7001305949πŸ‘ˆ 🎢 Independent Escort Service Guwahati
Β 
VIP Call Girls Hyderabad Megha 9907093804 Independent Escort Service Hyderabad
VIP Call Girls Hyderabad Megha 9907093804 Independent Escort Service HyderabadVIP Call Girls Hyderabad Megha 9907093804 Independent Escort Service Hyderabad
VIP Call Girls Hyderabad Megha 9907093804 Independent Escort Service Hyderabad
Β 
Model Call Girl in Subhash Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Subhash Nagar Delhi reach out to us at πŸ”9953056974πŸ”Model Call Girl in Subhash Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Subhash Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Β 
Russian Escorts Delhi | 9711199171 | all area service available
Russian Escorts Delhi | 9711199171 | all area service availableRussian Escorts Delhi | 9711199171 | all area service available
Russian Escorts Delhi | 9711199171 | all area service available
Β 
Call Girls in Hyderabad Lavanya 9907093804 Independent Escort Service Hyderabad
Call Girls in Hyderabad Lavanya 9907093804 Independent Escort Service HyderabadCall Girls in Hyderabad Lavanya 9907093804 Independent Escort Service Hyderabad
Call Girls in Hyderabad Lavanya 9907093804 Independent Escort Service Hyderabad
Β 
Models Call Girls Electronic City | 7001305949 At Low Cost Cash Payment Booking
Models Call Girls Electronic City | 7001305949 At Low Cost Cash Payment BookingModels Call Girls Electronic City | 7001305949 At Low Cost Cash Payment Booking
Models Call Girls Electronic City | 7001305949 At Low Cost Cash Payment Booking
Β 
Hi,Fi Call Girl In Whitefield - [ Cash on Delivery ] Contact 7001305949 Escor...
Hi,Fi Call Girl In Whitefield - [ Cash on Delivery ] Contact 7001305949 Escor...Hi,Fi Call Girl In Whitefield - [ Cash on Delivery ] Contact 7001305949 Escor...
Hi,Fi Call Girl In Whitefield - [ Cash on Delivery ] Contact 7001305949 Escor...
Β 
No Advance 9053900678 Chandigarh Call Girls , Indian Call Girls For Full Ni...
No Advance 9053900678 Chandigarh  Call Girls , Indian Call Girls  For Full Ni...No Advance 9053900678 Chandigarh  Call Girls , Indian Call Girls  For Full Ni...
No Advance 9053900678 Chandigarh Call Girls , Indian Call Girls For Full Ni...
Β 
Russian Call Girls Hyderabad Indira 9907093804 Independent Escort Service Hyd...
Russian Call Girls Hyderabad Indira 9907093804 Independent Escort Service Hyd...Russian Call Girls Hyderabad Indira 9907093804 Independent Escort Service Hyd...
Russian Call Girls Hyderabad Indira 9907093804 Independent Escort Service Hyd...
Β 
Call Girl Dehradun Aashi πŸ” 7001305949 πŸ” πŸ’ƒ Independent Escort Service Dehradun
Call Girl Dehradun Aashi πŸ” 7001305949 πŸ” πŸ’ƒ Independent Escort Service DehradunCall Girl Dehradun Aashi πŸ” 7001305949 πŸ” πŸ’ƒ Independent Escort Service Dehradun
Call Girl Dehradun Aashi πŸ” 7001305949 πŸ” πŸ’ƒ Independent Escort Service Dehradun
Β 
Russian Call Girls in Hyderabad Ishita 9907093804 Independent Escort Service ...
Russian Call Girls in Hyderabad Ishita 9907093804 Independent Escort Service ...Russian Call Girls in Hyderabad Ishita 9907093804 Independent Escort Service ...
Russian Call Girls in Hyderabad Ishita 9907093804 Independent Escort Service ...
Β 
Call Girls Service Chandigarh Grishma β€οΈπŸ‘ 9907093804 πŸ‘„πŸ«¦ Independent Escort Se...
Call Girls Service Chandigarh Grishma β€οΈπŸ‘ 9907093804 πŸ‘„πŸ«¦ Independent Escort Se...Call Girls Service Chandigarh Grishma β€οΈπŸ‘ 9907093804 πŸ‘„πŸ«¦ Independent Escort Se...
Call Girls Service Chandigarh Grishma β€οΈπŸ‘ 9907093804 πŸ‘„πŸ«¦ Independent Escort Se...
Β 
Hi,Fi Call Girl In Marathahalli - 7001305949 with real photos and phone numbers
Hi,Fi Call Girl In Marathahalli - 7001305949 with real photos and phone numbersHi,Fi Call Girl In Marathahalli - 7001305949 with real photos and phone numbers
Hi,Fi Call Girl In Marathahalli - 7001305949 with real photos and phone numbers
Β 
Call Girl Lucknow Gauri πŸ” 8923113531 πŸ” 🎢 Independent Escort Service Lucknow
Call Girl Lucknow Gauri πŸ” 8923113531  πŸ” 🎢 Independent Escort Service LucknowCall Girl Lucknow Gauri πŸ” 8923113531  πŸ” 🎢 Independent Escort Service Lucknow
Call Girl Lucknow Gauri πŸ” 8923113531 πŸ” 🎢 Independent Escort Service Lucknow
Β 
Book Call Girls in Hosur - 7001305949 | 24x7 Service Available Near Me
Book Call Girls in Hosur - 7001305949 | 24x7 Service Available Near MeBook Call Girls in Hosur - 7001305949 | 24x7 Service Available Near Me
Book Call Girls in Hosur - 7001305949 | 24x7 Service Available Near Me
Β 
Call Girls Dilsukhnagar 7001305949 all area service COD available Any Time
Call Girls Dilsukhnagar 7001305949 all area service COD available Any TimeCall Girls Dilsukhnagar 7001305949 all area service COD available Any Time
Call Girls Dilsukhnagar 7001305949 all area service COD available Any Time
Β 

Introduction to Proteogenomics

  • 1. Dr. Yasset Perez-Riverol Twitter/github: @ypriverol Proteomics Project Leader EMBL-EBI Hinxton, Cambridge, UK Proteogenomics Integration of proteomics data in Ensembl.
  • 2. Proteomics Bioinformatics Course EMBL-EBI, July – 2019 Outline I. Why multi-omics approaches are so important. II. What is proteogenomics. III. Proteogenomics by integrating proteomics and genomics resources. IV. Ensembl/UCSC Trackhubs. V. Integration of proteomics data into Ensembl trackhubs.
  • 3. Ritchie, Marylyn D., et al. "Methods of integrating data to uncover genotype–phenotype interactions." Nature Reviews Genetics 16.2 (2015): 85.
  • 4. Outline I. Why multi-omics approaches are so important. II. What is proteogenomics. III. Proteogenomics by integrating proteomics and genomics resources. IV. Ensembl/UCSC Trackhubs. V. Integration of proteomics data into Ensembl trackhubs.
  • 5. What is proteogenomics
      Proteogenomics is a field of biological research that combines proteomics, genomics, and transcriptomics to aid the discovery, identification, and quantification of peptides and proteins. It is used to identify new peptides by comparing MS/MS spectra against a protein database derived from genomic and transcriptomic information.
      Gene annotations: identification of novel peptides; in prokaryotic organisms, frameshifts, N-terminal methionine excision, signal peptides, and other post-translational modifications.
      Multi-omics analysis: correlation between genomic and proteomic sequence events (genetic mutations, post-translational modifications).
      Proteogenomics workflow (diagram labels): custom protein sequence database (six-frame / three-frame translation, RNA-seq data, exome data); FDR analysis and filtering rules; mapping to a reference system (genome coordinates, trackhubs) with high-quality evidence; downstream analysis (correlations between features).
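The six-frame translation mentioned on this slide can be illustrated with a minimal Python sketch. It assumes the standard genetic code and an input containing only A, C, G, and T; the function names and the frame labels (`+1` to `-3`) are hypothetical choices for this example and are not taken from any tool listed in the deck.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops are '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate a DNA sequence in three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

A three-frame translation, as used for stranded transcript sequences, would keep only the `+` frames.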
  • 6. Ruggles, Kelly V., et al. "Methods, tools and current perspectives in proteogenomics." Molecular & Cellular Proteomics (2017): mcp-000024.
      Tool: URL
      customProDB: https://www.bioconductor.org/packages/release/bioc/html/customProDB.html
      Galaxy Integrated Omics: https://bessantlab.org/software/gio/
      Galaxy-P: https://toolshed.g2.bx.psu.edu/view/galaxyp/proteomics_rnaseq_sap_db_workflow/3a11830963e3
      QUILTS: http://quilts.fenyolab.org
      MutationDB: http://proteomics.ucsd.edu/software-tools/cancerproteogenomics-4/
      SpliceDB: http://proteomics.ucsd.edu/software-tools/cancerproteogenomics-4/
      PGA: http://bioconductor.org/packages/PGA/
      MSProGene: http://sourceforge.net/projects/msprogene/
      PPLine: https://sourceforge.net/projects/ppline/
      Pypgatk (Python proteogenomics analysis toolkit): https://pgatk.readthedocs.io/en/latest/pypgatk.html
  • 7. Most peptides (about 90%) are localized within an exon; the remaining peptides, typically less than twenty percent, span an exon-exon junction.
  • 8. PGATK: ProteoGenomics Analysis Toolkit https://pgatk.readthedocs.io/en/latest/pypgatk.html
      Extend database for cancer-oriented studies. All MS experiments with two filters: taxonomy, tissues.
      COSMIC cancer mutations. cBioPortal cancer mutations.
      Experimental design:
      • VCF variants from Ensembl, including germline variations.
      • VCF variants from Ensembl, including somatic mutations (filter by tissues).
      • Cancer mutation protein databases based on COSMIC and cBioPortal: somatic mutations (filter by tissues).
      Database construction.
  • 9. Mutation VCF files -> Protein databases
  • 10. ProteinDB output and Decoy Generation
      >var_rs78350717_22.15528913.C.A_ENST00000252835 protein_coding:missense_variant
      MCPLTLQVTGLMNVSEPNSSFAFVNEFILQGFSCEWTIQIFLFSLFTTTYALTITGNGAIAFVLWC
      DRRLHTPMYMFLGNFSFLEIWYVSSTVPKMLVNFLSEKKNISFAGCFLQFYFFFSLGTSECLLLT
      VMAFDQYLAICRPLLYPNIMTGHLYAKLVILCWVCGFLWFLIPIVLISQMPFCGPNIIDHVVCDPGP
      RFALDCVSAPRIQLFCYTLSSLVIFGNFLFIIGSYTLVLKAMLGMPSSTGRHKDFSTCGSHLAVVS
      LCYSSLMVMYVSPGLGHSTGMQKIETLFYAMVTPLFNPLIYSLQNKEIKAALRKVLGSSNII*
      >var_rs78350717_22.15528913.C.T_ENST00000252835 protein_coding:missense_variant
      MCPLTLQVTGLMNVSEPNSSFAFVNEFILQGFSCEWTIQIFLFSLFTTTYALTITGNGAIAFVLWC
      DRRLHTPMYMFLGNFSFLEIWYVSSTVPKMLVNFLSEKKNISFAGCFLQFYFFFSLGTSECLLLT
      VMAFDQYLAICRPLLYPNIMTGHLYAKLVILCWVCGFLWFLIPIVLISQMPFCGPNIIDHVVCDPGP
      RFALDCVSAPRIQLFCYTLSSLVIFGNFLFIIGSYTLVLKAMLGMPSSTGRHKVFSTCGSHLAVVS
      LCYSSLMVMYVSPGLGHSTGMQKIETLFYAMVTPLFNPLIYSLQNKEIKAALRKVLGSSNII*
      >var_rs78350717_22.15528913.C.A_ENST00000643195 protein_coding:missense_variant
      MNVSEPNSSFAFVNEFILQGFSCEWTIQIFLFSLFTTTYALTITGNGAIAFVLWCDRRLHTPMYMF
      LGNFSFLEIWYVSSTVPKMLVNFLSEKKNISFAGCFLQFYFFFSLGTSECLLLTVMAFDQYLAICR
      PLLYPNIMTGHLYAKLVILCWVCGFLWFLIPIVLISQMPFCGPNIIDHVVCDPGPRFALDCVSAPRI
      QLFCYTLSSLVIFGNFLFIIGSYTLVLKAMLGMPSSTGRHKDFSTCGSHLAVVSLCYSSLMVMYV
      SPGLGHSTGMQKIETLFYAMVTPLFNPLIYSLQNKEIKAALRKVLGSSNII*
      >var_rs78350717_22.15528913.C.T_ENST00000643195 protein_coding:missense_variant
      MNVSEPNSSFAFVNEFILQGFSCEWTIQIFLFSLFTTTYALTITGNGAIAFVLWCDRRLHTPMYMF
      LGNFSFLEIWYVSSTVPKMLVNFLSEKKNISFAGCFLQFYFFFSLGTSECLLLTVMAFDQYLAICR
      PLLYPNIMTGHLYAKLVILCWVCGFLWFLIPIVLISQMPFCGPNIIDHVVCDPGPRFALDCVSAPRI
      QLFCYTLSSLVIFGNFLFIIGSYTLVLKAMLGMPSSTGRHKVFSTCGSHLAVVSLCYSSLMVMYV
      SPGLGHSTGMQKIETLFYAMVTPLFNPLIYSLQNKEIKAALRKVLGSSNII*
      >var_rs147461488_22.17740102.GGCCACGCTCAACT.G_ENST00000342111 protein_coding:frameshift_variant
      NGGGRERGS*AWPSWSCDSDSESQEDIIRNIARHLAQVGDSMDRSIPPGLVNGLALQLRNTSR
      SEEDRNRDLATALEQLLQAYPRDMEKEKTMLVWPRRWPVTRRPCSVMSFTQQ*ILLTRTYAPT*
      GA*PEM
      >var_rs147461488_22.17740102.GGCCACGCTCAACT.G_ENST00000617586 protein_coding:frameshift_variant
      EGGSARGAIRRKRVVDRVRAPGRRCLGPDAPAPPRLEGGRHWDTVNQE*VGAAALPRPWTVR
      STTVPASGMSASQTYWCLASSKAVLTTASAESCRAASAGSPVGGLR*AAD*WQPQQ
      https://pgatk.readthedocs.io/en/latest/pypgatk.html
      python pypgatk_cli.py generate-decoy -c config/protein_decoy.yaml --input proteindb_from_lincRNA_canonical_sequences.fa --output decoy_proteindb.fa
      Wright, James C., and Jyoti S. Choudhary. "DecoyPyrat: fast non-redundant hybrid decoy sequence generation for large scale proteomics." Journal of proteomics & bioinformatics 9.6 (2016): 176.
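To show what decoy generation does to a protein FASTA file, here is a deliberately simplified sketch that appends a reversed copy of every sequence. This is plain sequence reversal, not the DecoyPyrat method cited on the slide and not the pypgatk implementation; the `DECOY_` prefix and all function names are assumptions made for this example.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def with_reversed_decoys(entries, prefix="DECOY_"):
    """Yield each target entry followed by a reversed-sequence decoy."""
    for header, sequence in entries:
        yield header, sequence
        yield prefix + header, sequence[::-1]


def to_fasta(entries):
    """Format (header, sequence) pairs as FASTA text."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in entries)


if __name__ == "__main__":
    # Toy input: the sequence is truncated to a few residues for illustration.
    example = ">example_variant_protein\nMCPLTLQ\n"
    print(to_fasta(with_reversed_decoys(read_fasta(example))), end="")
```

Production tools typically go further, for example by preserving cleavage sites and avoiding decoy peptides that coincide with target peptides.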
  • 11. Reanalyzing the data. Zhu, Yafeng, et al. Nature communications 9.1 (2018): 903.
  • 12. Outline I. Why multi-omics approaches are so important. II. What is proteogenomics. III. Proteogenomics by integrating proteomics and genomics resources. IV. Ensembl/UCSC Trackhubs. V. Integration of proteomics data into Ensembl trackhubs.
  • 13. Integration of genomics / proteomics resources
      The number of multi-omics datasets is growing, but there are still not enough of them. An alternative option is to correlate and complement your proteomics data with existing public genomics data or with other proteomics datasets in the PRIDE database.
      Multi-omics analysis: correlation between genomic and proteomic sequence events (genetic mutations, post-translational modifications).
      Diagram labels: mapping to a reference system (genome coordinates, trackhubs); high-quality evidence (FDR analysis, filtering rules); downstream analysis (correlations between features); gene features, transcript features, protein features, protein family / structures, ...
  • 14. PRIDE datasets to Ensembl coordinates
      Components: (1) PX Submission Tool, (2) PRIDE submission pipelines, (3) PRIDE Archive Web and API, (4) TrackHub Registry.
      1. A PX submission can be Partial or Complete. Partial submission: RAW data, search results, and peak lists. Complete submission: RAW data, result files and peak lists, search results.
      3. Each PX submission can be searched by title, metadata, description, tissue, taxonomy, PTMs, peptide sequence, or protein identifier.
      4. The TrackHub Registry can search tracks by ShortLabel, LongLabel, and OmicsType (Proteomics, Genomics, Transcriptomics).
      5. Convert to mztab/mgf and filter out evidence that does not pass the reported mzid threshold.
      6. Store the project metadata, peptide sequences, and protein identifiers in Solr and MongoDB.
      Pipeline diagram labels: PX complete submission, assays, mzid and peak lists, mztab and mgf lists, assay peptides, Pogo file, taxonomies, track, TrackHub generation, TrackHub Registry.
  • 15. Generating reliable peptide tables
      Current filter options:
      • 1% FDR at PSM level (combined results)
      • 1% FDR at peptide level (combined results)
      Possible filters (HPP):
      • > 8 AA
      • 1% FDR at transcript level (inference needed)
      Combined PSM score: same spectrum and peptide, different search engines.
      Combined peptide score: same peptide, different PSMs.
      Experiment     Peptide                                   PSMs   Quant
      Assay 34642    APPLLEGAPFR                               1      1.000000
      Assay 34644    APPLLEGAPFR                               1      1.000000
      Assay 34642    THTQDAVPLTLGQEFSGYVQQVQYAM(oxidation)VR   1      1.000000
      Assay 34645    KKQVM(oxidation)EK                        1      1.000000
      Assay 34645    VGSGDTNNFPYLEK                            2      2.000000
      Assay 34645    SLTYLSILR                                 3      3.000000
      Assay 34642    LPFTPLSYIQGLSHR                           8      8.000000
      Experiment entries link to http://www.ebi.ac.uk/pride/archive/assays/ followed by the assay number.
      Figure: Protein Inference Toolkit, protein groups.
      Audain, Enrique, et al. "In-depth analysis of protein inference algorithms using multiple search engines and well-defined metrics." Journal of proteomics 150 (2017): 170-182.
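The 1% FDR filter listed on this slide can be sketched with a generic target-decoy calculation. This is a minimal illustration, not the PRIDE pipeline's actual scoring or score-combination procedure; the record fields (`id`, `score`, `is_decoy`) and the scores in the example are invented for illustration, and a higher score is assumed to mean a better match.

```python
def filter_by_fdr(psms, threshold=0.01):
    """Keep target PSMs whose q-value is at or below the threshold.

    FDR at each rank is estimated as decoys / targets among all PSMs
    scoring at least as well; the q-value is the lowest FDR at which
    a PSM would still be accepted.
    """
    ranked = sorted(psms, key=lambda psm: psm["score"], reverse=True)

    fdrs, targets, decoys = [], 0, 0
    for psm in ranked:
        if psm["is_decoy"]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # Walk from the worst score upwards so q-values never increase
    # as the score improves.
    qvalues, best = [], float("inf")
    for fdr in reversed(fdrs):
        best = min(best, fdr)
        qvalues.append(best)
    qvalues.reverse()

    return [
        psm for psm, q in zip(ranked, qvalues)
        if not psm["is_decoy"] and q <= threshold
    ]


if __name__ == "__main__":
    # Invented scores, for illustration only.
    example = [
        {"id": "psm1", "score": 10.0, "is_decoy": False},
        {"id": "psm2", "score": 9.0, "is_decoy": False},
        {"id": "psm3", "score": 8.0, "is_decoy": False},
        {"id": "psm4", "score": 7.0, "is_decoy": True},
        {"id": "psm5", "score": 6.0, "is_decoy": False},
    ]
    print([psm["id"] for psm in filter_by_fdr(example)])
```

The same function could be applied at peptide level by first collapsing PSMs to the best-scoring PSM per peptide.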
  • 16. Mapping peptides to ENSEMBL https://pgatk.readthedocs.io/en/latest/pepgenome.html
      For each .pogo file:
      • PTMs are standardized to a common representation using the PRIDE-Mod library.
      • Each peptide references an assay URL in PRIDE.
      • Each Pogo file is generated automatically by the PRIDE pipeline.
      Example BED output (column labels on the slide: chromosome, start, end, feature, score, strand, itemRgb):
      chr1 1314335 1314365 VLIPVFALGR 1000 - 1314335 1314335 0,0,0 1 30 0
      chr1 1454464 1454488 ITVLEALR 1000 + 1454464 1454464 128,128,128 1 24 0
      chr1 1456317 1456344 LFDWANTSR 1000 + 1456317 1456317 128,128,128 1 27 0
      chr1 1459184 1459211 ATLNAFLYR 1000 + 1459184 1459184 128,128,128 1 27 0
      chr1 1462609 1462633 LAQFDYGR 1000 + 1462609 1462609 128,128,128 1 24 0
      chr1 1485135 1485159 ITVLEALR 1000 + 1485135 1485135 128,128,128 1 24 0
      Future challenges:
      • BED information can be extended with more information about transcript reliability.
      • Peptide uniqueness.
      • Reliability score.
      • Native bigBed should be provided to remove the need to customize new pipelines, etc.
      • What to do with the unmapped peptides (which are long lists).
      • Maintainability.
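The BED lines on this slide can be read with a short parser. The sketch below assumes the twelve whitespace-separated columns follow the usual BED order (chrom, start, end, name, score, strand, thickStart, thickEnd, itemRgb, blockCount, blockSizes, blockStarts); the class and function names are hypothetical, and only the columns labelled on the slide are kept.

```python
from typing import NamedTuple


class PeptideFeature(NamedTuple):
    """One peptide mapped to genome coordinates."""
    chrom: str
    start: int
    end: int
    peptide: str
    score: int
    strand: str
    item_rgb: str


def parse_bed_line(line):
    """Parse one whitespace-separated BED line into a PeptideFeature."""
    fields = line.split()
    if len(fields) < 9:
        raise ValueError(f"expected at least 9 BED columns, got {len(fields)}")
    return PeptideFeature(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        peptide=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        item_rgb=fields[8],
    )


if __name__ == "__main__":
    line = "chr1 1454464 1454488 ITVLEALR 1000 + 1454464 1454464 128,128,128 1 24 0"
    feature = parse_bed_line(line)
    # A peptide within a single exon spans three nucleotides per residue.
    print(feature.peptide, feature.end - feature.start, 3 * len(feature.peptide))
```

For the single-block rows shown on the slide, the genomic span equals three times the peptide length; peptides that cross an exon-exon junction would instead have several blocks and a longer span.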
  • 17. Complete Submissions (Human) PSMs = 4,374,055 MOD-PSMs = 1,225,565 http://ftp.pride.ebi.ac.uk/pride/data/proteogenomics/latest/
  • 18. Complete Submissions (Mouse)
  • 19. PTMs distribution per chromosome. Human (hg38), Mouse (mm10).
      • Black (all identified peptides)
      • Cyan (oxidation)
      • Orange (acetyl)
      • Red (phospho)
      https://github.com/bigbio/PepBed
  • 20. PTM occurrence (oxidation/phosphorylation) on ~20,000 human protein-coding genes. Panels: Oxidation, Phosphorylation.
  • 21. Outline I. Why multi-omics approaches are so important. II. What is proteogenomics. III. Proteogenomics by integrating proteomics and genomics resources. IV. Ensembl/UCSC Trackhubs. V. Integration of proteomics data into Ensembl trackhubs.
  • 22. Introduction to Genome Browsers
      • Browse genes in their genomic context.
      • See features in and around a specific gene.
      • Investigate genome organization and explore larger chromosome regions.
      Ensembl Genome Browser: http://www.ensembl.org/
      UCSC Genome Browser: http://genome.ucsc.edu/
  • 23. ENSEMBL Browser https://www.ebi.ac.uk/training/online/course/ensembl-browser-webinar-series-2016/introduction-ensembl
  • 24. What are Track hubs?
      • Internet-accessible collections of genome annotations.
      • Transfer on demand: hub annotations are stored at the remote site.
      • Client-side caching.
  • 25. Trackhub structure
      • hub.txt: defines the labels used to describe the hub.
      • genomes.txt: describes the assemblies supported by the hub.
      • trackDb.txt: describes the data files and defines their display attributes.
          • Complex format, a collection of stanzas.
          • Defines the display and configuration properties.
      Track database definition document: http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html
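The three files described on this slide can be illustrated by generating a minimal hub from Python. The setting names (`hub`, `shortLabel`, `longLabel`, `genomesFile`, `email`, `genome`, `trackDb`, `track`, `bigDataUrl`, `type`) follow the UCSC track hub conventions; every value below (hub name, labels, e-mail address, file names) is a placeholder invented for this sketch, and the helper functions are hypothetical.

```python
def stanza(settings):
    """Render one stanza: a 'key value' pair per line."""
    return "\n".join(f"{key} {value}" for key, value in settings.items()) + "\n"


def build_hub_files(hub_name, assembly, bigbed_url):
    """Return a {relative path: file content} mapping for a one-track hub."""
    track_db_path = f"{assembly}/trackDb.txt"
    return {
        "hub.txt": stanza({
            "hub": hub_name,
            "shortLabel": "Example peptides",
            "longLabel": "Example peptide evidence mapped to the genome",
            "genomesFile": "genomes.txt",
            "email": "someone@example.org",
        }),
        "genomes.txt": stanza({
            "genome": assembly,
            "trackDb": track_db_path,
        }),
        track_db_path: stanza({
            "track": f"{hub_name}_peptides",
            "bigDataUrl": bigbed_url,
            "shortLabel": "Peptides",
            "longLabel": "Peptides identified by mass spectrometry",
            "type": "bigBed 12",
        }),
    }


if __name__ == "__main__":
    files = build_hub_files("example_hub", "hg38", "peptides.bb")
    for path, content in files.items():
        print(f"--- {path} ---")
        print(content)
```

A hub with several tracks would separate the stanzas in trackDb.txt with blank lines; the track database definition document linked on the slide lists the full set of supported settings.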
  • 26. Track Hubs in Ensembl
  • 27. Attaching a hub (Ensembl)
  • 28. Attached hub (Ensembl)
  • 29. Attached hub (Ensembl)
  • 30. Developed By: Alessandro Vullo
  • 31. Trackhub search. Search results for the keyword "proteome". Filter by species (organisms) / assembly version.
  • 32. Trackhub detailed view
  • 33. Outline I. Why multi-omics approaches are so important. II. What is proteogenomics. III. Proteogenomics by integrating proteomics and genomics resources. IV. Ensembl/UCSC Trackhubs. V. Integration of proteomics data into Ensembl trackhubs.
  • 37. PRIDE team. Johannes Griss (pride cluster pipelines) Boston Children Hospital Christoph Schlaffner (PepGenome tool) Juan A. Vizcaino (PI) Alessandro Vullo (trackhub registry) ENSEMBL Team Chakradhar Reddy Bandla (PepGenome tool) Karolinska Institute Husen Umer (PyPGATK tool) Rui Branca (PyPGATK tool) Yafeng Zhu (PyPGATK tool)