3. Divergence of protein identifiers
2. Methods
7. References
Will the real pharmacologically significant
proteins please stand up?
1. Introduction
Even in their more contemplative moments probably few pharmacologists cogitate on
“so how many human proteins actually exist?” Nevertheless, on a practical level their
engagement with names and identifiers (IDs) for pharmacological protein targets and
disease mechanistic components is intense and includes navigating between
databases and the literature. This work addresses three important aspects of protein
equivocality that pharmacologists may be less aware of but that we encounter head-on
during curation of the IUPHAR/BPS Guide to PHARMACOLOGY [1, 2]. These are:
1. Variability in canonical counts, ranging from 19,198 at the HUGO Gene Nomenclature
Committee (HGNC) up to 21,341 in GeneCards, indicating a surprising annotation
discordance for at least 10% of the human proteome
2. Uncertainty of alternatively spliced (AS) protein existence. While Ensembl predicts
over 100,000 AS mRNAs, the verification of these by proteomics is 30-fold less than
expected, implying that the majority do not exist in vivo [3]
3. Evidence that some canonical Swiss-Prot (SP) entries are not the major isoform
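As a quick check on the "at least 10%" figure in point 1, the arithmetic can be sketched as below. The two counts are the ones quoted above; which count to use as the denominator is an assumption on our part, so both are shown.

```python
# Canonical human protein counts quoted in point 1 above.
HGNC_COUNT = 19_198
GENECARDS_COUNT = 21_341

# Absolute difference between the lowest and highest quoted counts.
difference = GENECARDS_COUNT - HGNC_COUNT

# The choice of denominator is an assumption, so report both.
pct_of_hgnc = round(100 * difference / HGNC_COUNT, 1)
pct_of_genecards = round(100 * difference / GENECARDS_COUNT, 1)

print(f"difference: {difference}")
print(f"relative to HGNC: {pct_of_hgnc}%")
print(f"relative to GeneCards: {pct_of_genecards}%")
```

Either denominator gives a discordance of roughly 10% or more, consistent with the statement above.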
Using UniProt we ascertained the 4-way intersect between SP protein IDs, HGNC Gene
Symbols, Ensembl genes and NCBI Gene IDs. The four sets were selected using
cross-reference queries from the UniProt interface. We then accessed our internal
protein statistics including the total human UniProt IDs that we had curated into GtoPdb
and those for which we had annotated data-supported and pharmacologically-relevant
ligand interactions. These were compared to the 4-way sequence set. We also counted
proteins for which UniProt had curated splice forms using the query “Alternative splicing
(KW-0025)”. We then compared these with our ligand interaction set. We also
inspected one splice form that has been annotated in GtoPdb and checked the
information in SP. To address the isoform abundance question we queried the
Annotation of principal and alternative splice isoforms (APPRIS) database to check
targets [4].
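The 4-way intersect described above amounts to a set intersection over Swiss-Prot accessions grouped by the cross-reference they carry. The following is a minimal sketch with toy data; the accession strings, the dictionary layout and the function names are hypothetical, and the actual sets were obtained through cross-reference queries in the UniProt interface rather than with this code.

```python
from typing import Dict, Set


def four_way_intersect(xrefs: Dict[str, Set[str]]) -> Set[str]:
    """Accessions that appear in every cross-reference set."""
    sets = list(xrefs.values())
    return set.intersection(*sets) if sets else set()


def only_in(name: str, xrefs: Dict[str, Set[str]]) -> Set[str]:
    """Accessions found in the named set and in none of the others."""
    others = set().union(*(s for k, s in xrefs.items() if k != name))
    return xrefs[name] - others


# Toy data: hypothetical accessions, not real Swiss-Prot entries.
xrefs = {
    "swissprot": {"ACC_A", "ACC_B", "ACC_C", "ACC_D", "ACC_E"},
    "hgnc": {"ACC_A", "ACC_B", "ACC_C"},
    "ensembl": {"ACC_A", "ACC_B"},
    "ncbi_gene": {"ACC_A", "ACC_B", "ACC_D"},
}

consensus = four_way_intersect(xrefs)
sp_only = only_in("swissprot", xrefs)
print(sorted(consensus), sorted(sp_only))
```

Entries such as the toy `ACC_E`, present in Swiss-Prot but lacking all three genomic cross-references, correspond to the kind of SP-only segment that merits manual inspection.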
1. Harding SD, et al. (2018). Nucl. Acids Res. 46 (Database Issue): D1091-D1106.
2. Southan C, et al. (2018). ACS Omega 3(7). PMID: 30087946.
3. Rodriguez JM, et al. (2018). Nucl. Acids Res. 46 (Database Issue): D213-D217.
4. Tress ML, et al. (2017). Trends Biochem Sci. 42(2): 98-110.
5. Southan C (2017). F1000Res. 7;6:448.
5. Protein alternative splicing
Christopher Southan, Simon D. Harding, Elena Faccenda, Adam J. Pawson and Jamie A. Davies.
IUPHAR/BPS Guide to PHARMACOLOGY, Centre for Discovery Brain Sciences, University of Edinburgh, UK
6. Discussion points
• In addition to the AS touched on here, further sources of protein equivocality and
heterogeneity include alternative initiations and post-translational modifications.
• The multiplexing of these from a canonical set of ~19,000 proteins (itself still
without a consensus) is predicted to run into the millions.
• The significance of this for pharmacology, systems biology and drug discovery is
acknowledged to be high but getting solid experimental data is difficult.
• GtoPdb users are welcome to alert us to potentially curatable papers on differential
ligand interactions related to any forms of protein heterogeneity.
www.guidetopharmacology.org enquiries@guidetopharmacology.org @GuidetoPHARM
4. Comparing the consensus with GtoPdb
We especially thank all contributors, collaborators and NC-IUPHAR members
In the Venn diagram on the right the 4-way intersect shows that these four major
global pipelines concur for fewer than 19,000 protein-coding genes. Most divergent is
the 829 SP-only set. Inspection established that many of these are categorised as
pseudogenes by HGNC [5]. This surprising result includes some missing genomic
cross-mappings inside SP. However, the consensus is close to the HGNC count of
19,118 (note that Ensembl and NCBI reciprocally cross-map, hence the empty sections).
Our next step was to compare the 4-way set from the comparison above (blue) with
a) all the human proteins we have entered in GtoPdb (yellow) and b) those proteins
that have a curated interaction (mostly quantitative) against one or more of the
9405 ligands (green). The results were generally as expected in confirming that the
majority of our proteins are within the 4-way set (i.e. solidly supported). However,
the analysis was valuable in detecting minor anomalies (represented in segments of
5, 6 and 23). These are being followed up, but a major factor is that some of these
are missing GeneID cross-references in Swiss-Prot (i.e. they are false negatives for
the blue set).
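The segment counts in a three-set Venn comparison of this kind can be derived by recording, for each protein, which sets it belongs to. Below is a small sketch under stated assumptions: the set names, the toy accessions and the `venn_regions` helper are all hypothetical and are not part of the GtoPdb tooling.

```python
from typing import Dict, Set, Tuple


def venn_regions(sets: Dict[str, Set[str]]) -> Dict[Tuple[str, ...], int]:
    """Count items per Venn segment, keyed by the names of the sets containing them."""
    names = list(sets)
    universe = set().union(*sets.values())
    counts: Dict[Tuple[str, ...], int] = {}
    for item in universe:
        key = tuple(n for n in names if item in sets[n])
        counts[key] = counts.get(key, 0) + 1
    return counts


# Toy data: hypothetical accessions standing in for the three sets.
sets = {
    "four_way": {"ACC_A", "ACC_B", "ACC_C", "ACC_D"},
    "gtopdb_all": {"ACC_B", "ACC_C", "ACC_D", "ACC_E"},
    "gtopdb_interactions": {"ACC_C", "ACC_D", "ACC_E"},
}

regions = venn_regions(sets)
for key, count in sorted(regions.items()):
    print(key, count)
```

Segments whose key omits `four_way` (here the toy `ACC_E`) are the analogue of the minor anomalies mentioned above and are the ones worth checking for missing cross-references.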
It is difficult to find papers with solid data showing AS affecting proteins for which we
have curated ligand interactions and which may thus exhibit differential pharmacology. Many
publications indicate that AS transcription a) is widespread, b) affects the majority of the
mammalian proteome and c) is likely to be functionally important in various biological
contexts (e.g. tumours and brain tissue) even if the mechanisms are unclear.
Notwithstanding, there are major uncertainties in proving the existence of AS proteins
since they are difficult to verify in vivo. We approached this question by counting our
interaction proteins with AS sequence variants annotated in Swiss-Prot.
The results of this are shown on the right. The yellow circle indicates that 52% of
human SP has at least one AS protein sequence annotated. This rises slightly to 54%
in our interaction set (blue). Importantly, AS in SP is target-class specific, rising to
70% for kinases but only 14% for GPCRs (since many are single-exon genes). Note that
Ensembl predicts considerably more potential AS sequences than SP curates.
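The per-class percentages above are simple proportions of proteins carrying at least one annotated AS sequence. A minimal sketch of that tally follows; the record layout and the toy values are assumptions for illustration and do not reproduce the real counts.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def as_percentage_by_class(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Percentage of proteins per target class with at least one AS annotation."""
    totals: Dict[str, int] = defaultdict(int)
    with_as: Dict[str, int] = defaultdict(int)
    for target_class, has_as in records:
        totals[target_class] += 1
        if has_as:
            with_as[target_class] += 1
    return {c: 100.0 * with_as[c] / totals[c] for c in totals}


# Toy records: (target class, has an annotated AS sequence). Values are illustrative only.
records = [
    ("kinase", True), ("kinase", True), ("kinase", False), ("kinase", True),
    ("GPCR", False), ("GPCR", False), ("GPCR", True), ("GPCR", False),
]

percentages = as_percentage_by_class(records)
print(percentages)
```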
In GtoPdb we only assign quantitative and differentially-specific AS-ligand interactions
if the papers meet our curatorial stringency. We also need evidence that data-supported
differential binding has pharmacological significance. This is challenging for
many reasons that cannot be expanded on here (but we would be pleased to discuss).
Consequently, we have only one AS entry, namely the interaction between protein target
2903 (claudin18) and antibody ligand 9209 (below, together with the AS first exon).
The specific case of claudin18, and extrapolation to other AS proteins in GtoPdb,
raises the question as to which sequence may be quantitatively dominant (i.e. the
principal isoform in vivo). However, there are inherent challenges in quantifying
AS-specific peptides by mass-spec proteomics or estimating surrogate relative
abundances from transcription data. We thus chose the APPRIS database, which
uses a range of computational methods and fold coverage scores to select the most
likely principal isoform. In this case the two SP isoforms scored equally.

(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
 
role of pramana in research.pptx in science
role of pramana in research.pptx in sciencerole of pramana in research.pptx in science
role of pramana in research.pptx in science
 
Lateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensiveLateral Ventricles.pdf very easy good diagrams comprehensive
Lateral Ventricles.pdf very easy good diagrams comprehensive
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
 
Structural Classification Of Protein (SCOP)
Structural Classification Of Protein  (SCOP)Structural Classification Of Protein  (SCOP)
Structural Classification Of Protein (SCOP)
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
 

Will the real proteins please stand up

Will the real pharmacologically significant proteins please stand up?

Christopher Southan, Simon D. Harding, Elena Faccenda, Adam J. Pawson and Jamie A. Davies
IUPHAR/BPS Guide to PHARMACOLOGY, Centre for Discovery Brain Sciences, University of Edinburgh, UK

1. Introduction
Even in their more contemplative moments, probably few pharmacologists cogitate on “so how many human proteins actually exist?” Nevertheless, on a practical level their engagement with names and identifiers (IDs) for pharmacological protein targets and disease mechanistic components is intense and includes navigating between databases and the literature. This work addresses three important aspects of protein equivocality that pharmacologists may be less aware of but that we encounter head-on during curation of the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb) [1, 2]. These are:
1. Variability in canonical counts, from 19,198 in the HUGO Gene Nomenclature Committee (HGNC) up to 21,341 in GeneCards, indicating a surprising annotation discordance for at least 10% of the human proteome.
2. Uncertainty of alternatively spliced (AS) protein existence. While Ensembl predicts over 100,000 AS mRNAs, the verification of these by proteomics is 30-fold less than expected, implying that the majority do not exist in vivo [3].
3. Evidence that some canonical Swiss-Prot (SP) entries are not the major isoform.

2. Methods
Using UniProt we ascertained the 4-way intersect between SP protein IDs, HGNC Gene Symbols, Ensembl genes and NCBI Gene IDs. The four sets were selected using cross-reference queries from the UniProt interface. We then accessed our internal protein statistics, including the total human UniProt IDs that we had curated into GtoPdb and those for which we had annotated data-supported and pharmacologically relevant ligand interactions. These were compared to the 4-way sequence set. We also counted proteins for which UniProt had curated splice forms using the query “Alternative splicing (KW-0025)”. We then compared these with our ligand interaction set. We also inspected one splice form that has been annotated in GtoPdb and checked the information in SP. To address the isoform abundance question we queried the Annotation of principal and alternative splice isoforms (APPRIS) database to check targets [4].

3. Divergence of protein identifiers
In the Venn diagram on the right the 4-way intersect shows that these four major global pipelines concur for fewer than 19,000 protein-coding genes. Most divergent is the 829 SP-only set. Inspection established that many of these are categorised as pseudogenes by HGNC [5]. This surprising result includes some missing genomic cross-mappings inside SP. However, the consensus is close to the HGNC count of 19,118 (note that Ensembl and NCBI reciprocally cross-map, hence the empty sections).

4. Comparing the consensus with GtoPdb
Our next step was to compare the 4-way set from the comparison above (blue) with a) all the human proteins we have entered in GtoPdb (yellow) and b) those proteins that have a curated interaction (mostly quantitative) against one or more of the 9405 ligands (green). The results were generally as expected in confirming that the majority of our proteins are within the 4-way set (i.e. solidly supported). However, the analysis was valuable in detecting minor anomalies (represented in the segments of 5, 6 and 23). These are being followed up, but a major factor is that some of these are missing GeneID cross-references in Swiss-Prot (i.e. they are false negatives for the blue set).

5. Protein alternative splicing
It is difficult to find papers with solid data showing AS affecting proteins for which we have curated ligand interactions and which may thus exhibit differential pharmacology. Many publications indicate that AS transcription a) is widespread, b) affects the majority of the mammalian proteome and c) is likely to be functionally important in various biological contexts (e.g. tumours and brain tissue), even if the mechanisms are unclear. Notwithstanding, there are major uncertainties in proving the existence of AS proteins since they are difficult to verify in vivo.

We approached this question by counting our interaction proteins with AS sequence variants annotated in Swiss-Prot. The results of this are shown on the right. The yellow circle indicates that 52% of human SP has at least one AS protein sequence annotated. This rises slightly to 54% in our interaction set (blue). Importantly, AS in SP is target-class specific, rising to 70% for kinases but only 14% for GPCRs (since many are single-exon genes). Note that Ensembl predicts considerably more potential AS sequences than SP curates.

In GtoPdb we only assign quantitative and differentially specific AS-ligand interactions if the papers meet our curatorial stringency. We also need evidence that data-supported differential binding has pharmacological significance. This is challenging for many reasons that cannot be expanded on here (but we would be pleased to discuss). Consequently, we have only one AS entry: the interaction between protein target 2903 (claudin18) and antibody ligand 9209 (below, together with the AS first exon).

The specific case of claudin18, and extrapolation to other AS proteins in GtoPdb, raises the question as to which sequence may be quantitatively dominant (i.e. the principal isoform in vivo). However, there are inherent challenges in quantifying AS-specific peptides by mass-spec proteomics or estimating surrogate relative abundances from transcription data. We thus chose the APPRIS database, which uses a range of computational methods and fold coverage scores to select the most likely principal isoform. In this case the two SP isoforms scored equally.

6. Discussion points
• In addition to AS touched on here, additional sources of protein equivocality and heterogeneity include alternative initiations and post-translational modifications.
• The multiplexing of these from a canonical set of ~19,000 proteins (still without a consensus) is predicted to run into the millions.
• The significance of this for pharmacology, systems biology and drug discovery is acknowledged to be high, but getting solid experimental data is difficult.
• GtoPdb users are welcome to alert us to potentially curatable papers on differential ligand interactions related to any forms of protein heterogeneity.

www.guidetopharmacology.org
enquiries@guidetopharmacology.org
@GuidetoPHARM

We especially thank all contributors, collaborators and NC-IUPHAR members.

7. References
1. Harding SD, et al. (2018) Nucl. Acids Res. 46 (Database Issue): D1091-D1106.
2. Southan C, et al. (2018) ACS Omega 3(7), PMID: 30087946.
3. Rodriguez JM, et al. (2018) Nucl. Acids Res. 46 (Database Issue): D213-D217.
4. Tress ML, et al. (2017) Trends Biochem Sci. 42(2):98-110.
5. Southan C (2017) F1000Res. 7;6:448.
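
The set comparisons described in the Methods (the 4-way intersect, the SP-only set, and the share of an interaction set carrying AS annotation) can be sketched with plain Python sets. This is a minimal illustration only, not the procedure actually used for the poster: the accession strings (`ACC1` and so on), the function names, and the toy memberships are all hypothetical placeholders, and in practice each input set would hold the Swiss-Prot accessions returned by the corresponding UniProt cross-reference query.

```python
from typing import Set


def four_way_intersect(sp: Set[str], hgnc: Set[str],
                       ensembl: Set[str], ncbi: Set[str]) -> Set[str]:
    """SP accessions that carry all three cross-references (the consensus set)."""
    return sp & hgnc & ensembl & ncbi


def sp_only(sp: Set[str], hgnc: Set[str],
            ensembl: Set[str], ncbi: Set[str]) -> Set[str]:
    """SP accessions with none of the three cross-references."""
    return sp - (hgnc | ensembl | ncbi)


def fraction_with_as(proteins: Set[str], as_annotated: Set[str]) -> float:
    """Share of a protein set with at least one annotated AS sequence."""
    if not proteins:
        return 0.0
    return len(proteins & as_annotated) / len(proteins)


# Hypothetical toy data: every accession below is a placeholder.
sp = {"ACC1", "ACC2", "ACC3", "ACC4", "ACC5"}   # all human SP entries
hgnc = {"ACC1", "ACC2", "ACC3"}                 # SP entries with an HGNC symbol
ensembl = {"ACC1", "ACC2", "ACC4"}              # SP entries with an Ensembl gene
ncbi = {"ACC1", "ACC2", "ACC4"}                 # SP entries with an NCBI Gene ID

consensus = four_way_intersect(sp, hgnc, ensembl, ncbi)
divergent = sp_only(sp, hgnc, ensembl, ncbi)

# Proteins with curated ligand interactions (placeholder membership).
interaction_set = {"ACC1", "ACC3"}
outside_consensus = interaction_set - consensus  # candidates for follow-up

# Entries matching an "Alternative splicing" keyword query (placeholder).
as_annotated = {"ACC1", "ACC4"}
as_share = fraction_with_as(interaction_set, as_annotated)

print(sorted(consensus), sorted(divergent), sorted(outside_consensus), as_share)
```

Working with accession sets keeps each Venn segment a one-line expression, so anomalies such as interaction proteins that fall outside the consensus can be listed directly for manual inspection.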