Presentation on March 12, 2011 at the Skaggs School of Pharmacy and Pharmaceutical Sciences (UCSD) during the Workshop in Allosteric and Orthosteric Ligands in Drug Action
Presentation made at PepTalk 2011 in San Diego on Jan. 13, 2011. The emphasis is on computational methods to explore global and local structure similarities in determining the possible promiscuity of drugs to bind to multiple protein receptors.
iDiffIR: Identifying differential intron retention from RNA-seq - Araport
iDiffIR is a method for identifying differential intron retention from RNA-seq. For more information, please visit http://combi.cs.colostate.edu/idiffir/
The halo(gen) effect in para-substituted phenyl rings - EuroCup2015 - Jonas Boström
One key to successfully progressing a drug discovery project is to make first-rate decisions, (hopefully) based on unambiguous data. This is not trivial, since our scientific problems are often very complex and data can be fuzzy. In drug design we try to approach this uncertainty by being rational. It is, however, sometimes forgotten that our rational approaches may not be that rational after all: decisions may well be based on personal preferences and intuitive biases, perhaps unconsciously made on biased data.
Protein-protein interactions of transcription factors from the drought QTL-ho... - ICRISAT
Drought is the most prominent abiotic stress that affects the productivity of chickpea. The signalling networks in response to drought comprise several major components that include transcription factors (TFs).
5 Jan 2015
Rings In (Candidate) Drugs - Case Stories - Jonas Boström
Selected project impacts from over a decade of using insights from computational chemistry. The focus is on heterocyclic rings in candidate drugs discovered at AstraZeneca/CVMD and the strategies used in their design. The case stories will include a wide variety of examples, such as (i) replacing unwanted functional groups like acids and esters with heterocyclic rings, (ii) using rings for geometrical reasons and (iii) using heterocyclic rings to fine-tune electrostatics to obtain improved properties. In most cases the key computational approach for designing candidate drugs has been the use of shape and electrostatic comparisons between molecules. The role of luck is also discussed.
Identification of PFOA linked metabolic diseases by crossing databases - Yoann Pageaud
The increasing amount of biological data makes their interpretation more accurate and richer than ever before. Various ways of representing and interpreting the links between these data have consequently been applied or developed, and they can be taken into account in diagnostics and, soon, in personalized medicine. The aim of this student project was to cross data from various databases in order to link perfluorooctanoic acid (PFOA) to one or more human phenotypes and metabolic diseases. Our approach makes possible an easy and confident interpretation of the data kept and also allows us to rank the linked diseases according to their risk of correlation to a specific set of proteins.
ChEC-seq is a method used to identify protein-DNA interactions across a genome. It involves fusing micrococcal nuclease (MNase) to a protein of interest. In principle, specific genome-wide interactions of the fusion protein with chromatin result in local DNA cleavages that can be mapped by DNA sequencing. ChEC-seq has been used to draw conclusions about broad gene-specificities of certain protein-DNA interactions. In particular, the transcriptional regulators SAGA, TFIID, and Mediator are reported to generally occupy the promoter/UAS of genes transcribed by RNA polymerase II in yeast. Here we compare published yeast ChEC-seq data generated with a variety of protein fusions across essentially all genes, and find high similarities with negative controls. We conclude that ChEC-seq patterning for SAGA, TFIID, and Mediator differs little from background at most promoter regions, and thus cannot be used to draw conclusions about broad gene specificity of these factors.
Systems Pharmacology as a tool for future therapy development: a feasibility ... - Guide to PHARMACOLOGY
Systems pharmacology has the potential to facilitate a novel range of medical interventions. Databases such as the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb, www.guidetopharmacology.org) provide information on drugs and their pharmacological effects. Combining these resources with understanding of biological systems gives us the opportunity to predict, model and quantify the effects of drug administration on whole systems. We can also ask how multiple drugs can be used together in new types of therapies that outperform conventional single target therapies.
Here, we explore the feasibility of undertaking a systems pharmacology analysis of the mevalonate branch of the cholesterol biosynthesis pathway.
Presented by Joanna Sharman at ISMB/ECCB 2015 in Dublin
Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ... - Chris Southan
Presented by Jamie Davies at the SULSA Synthetic Biology Meeting, Edinburgh, 10 June 2014
http://www.eventbrite.co.uk/e/sulsa-synthetic-biology-meeting-registration-11251454403?aff=eorg
Abstract: Synthetic creation of new biological systems typically incorporates pathways and signaling modules from known protein building blocks. Testing the models underpinning the synthetic engineering thus needs the experimental manipulation of individual proteins, for example, ablating a specific enzyme activity via RNAi, SNP mutation, or knockout. However, the option of small-molecule inhibition as the system perturbation has the advantages of 1) rapid onset, 2) dose-response, 3) analog testing for structure-activity relationships, 4) exploring mixtures for combinatorial effects, 5) pulsing and reversal by wash-out, 6) accurate measurements of added substances and 7) a vast precedent of published results in natural systems from medicinal chemistry, pharmacology, and chemical biology. For synthetic biologists the GToPdb [1] can thus be considered a compendium of the latter. It encompasses an interaction matrix between ~4000 small molecules and ~1000 human proteins with a focus on drugs, clinical candidates, research compounds and peptide ligands. These not only have ~10,000 mapped binding constants, but the spectrum of documented modulation also extends across enzymes, receptors, channels and transporters. It thus becomes an increasingly plausible option to choose a “Lego protein” from GToPdb as a synthetic system component that can have experimentally usable activity probes available from chemical vendors. Even if it does not currently have a suitable target-probe pair, as a knowledge base (and expertise resource via the curation team who populate it) GToPdb is an ideal starting point from which to walk out to wider chemogenomic spaces. For example, while an approved drug and its target might seem a logical choice, analogs from the lead series or different chemotypes from which the drug was optimized, or even compounds that failed in development, can have superior probe-like properties for in vitro experiments (e.g. be more potent, specific and soluble).
The GToPdb facilitates access to such compound data via curated papers and patents.
References
1. Pawson AJ, Sharman JL, Benson HE, Faccenda E, Alexander SP, Buneman OP, Davenport AP, McGrath JC, Peters JA, Southan C, Spedding M, Yu W, Harmar AJ; NC-IUPHAR. The IUPHAR/BPS Guide to PHARMACOLOGY: an expert-driven knowledgebase of drug targets and their ligands. Nucleic Acids Res. 2014 Jan 1;42(1)
Presented to David Gloriam's Group, Copenhagen, Feb 2020
The theme will be presented from the perspective of both past involvement in peptide curation in the Guide to Pharmacology (GtoPdb) and current searching for bioactive peptides in the wider ecosystem that includes ChEMBL and PubChem. The core problem is that peptides hang in limbo land between bioinformatics (BLAST) and cheminformatics (Tanimoto), neither of which provides optimal searching. Curating peptides in GtoPdb presents many challenges, including mapping endogenous peptides to Swiss-Prot cleavage annotations. For synthetic peptides, equivocal specification of modifications and exact positions of radiolabels are also problematic. However, target-mapped, citation-supported quantitative binding parameters are curated where possible. For those peptides falling below the PubChem CID SMILES limit of approximately 70 residues, GtoPdb has been using Sugar and Splice from NextMove Software to convert them into CIDs. Specific problems associated with finding bioactive peptides in databases will be outlined.
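To make the cheminformatics side of this gap concrete, the following is a minimal sketch of a Tanimoto coefficient computed over feature sets. As a stand-in for a real chemical fingerprint it uses overlapping residue k-mers from one-letter peptide sequences. The function names, the k-mer featurisation and the single-substitution analogue are illustrative assumptions, not a description of how GtoPdb, ChEMBL or PubChem compute similarity.

```python
def kmer_features(sequence: str, k: int = 2) -> set[str]:
    """Return the set of overlapping k-mers in a one-letter peptide sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def tanimoto(a: set[str], b: set[str]) -> float:
    """Tanimoto coefficient: shared features divided by all distinct features."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(a | b)


# Toy input: angiotensin II and a hypothetical analogue with the last residue changed.
parent = "DRVYIHPF"
analog = "DRVYIHPA"

score = tanimoto(kmer_features(parent), kmer_features(analog))
print(f"Tanimoto similarity: {score:.2f}")
```

One substitution changes only one of seven dipeptide features here, so the score stays high. A real fingerprint on the full atom-level structure would respond differently, which is part of why neither search paradigm fits peptides well.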
Vicissitudes of target validation for BACE1 and BACE2 - Chris Southan
Introduction/Background & Aims
The beta-site amyloid precursor protein (APP) cleaving enzyme 1 (BACE1) was implicated as a drug target for Alzheimer's Disease (AD) back in 1999. In 2011, the paralogue, BACE2, became a new proposed target for type II diabetes (T2DM), having been reported to be the TMEM27 secretase regulating pancreatic beta-cell function [1]. By 2019 the accumulated evidence, including a swathe of failed clinical trials for BACE1 inhibitors, had produced a de facto de-validation of both targets in both diseases. As a learning exercise, the series of events leading up to this is reviewed here.
Method/Summary of work
Basic information about these two targets and the lead compounds against them was sourced via the IUPHAR/BPS Guide to Pharmacology (GtoPdb) as Target ids 2330 and 2331, for BACE1 and BACE2, respectively. This was consolidated by a literature and patent review as well as by following them in other databases. The most recent information on clinical trials was sourced from press releases.
Results/Discussion
GtoPdb annotates 24 lead compounds against BACE1 and 12 against BACE2. The corresponding counts mapped to these targets in ChEMBL are 8741 and 1377, making BACE1 one of the most actively pursued enzyme targets ever. Notwithstanding the massive global effort, during 2018 Merck’s verubecestat and J&J’s atabecestat BACE1 inhibitors not only failed their Phase III endpoints but even appeared to worsen cognition in prodromal patients. In 2019 Amgen/Novartis stopped Phase II/III trials of umibecestat, which also showed more cognitive decline in the treatment group compared to controls. BACE2 presented an anomalous situation in several ways. By 2016 both Novartis and Amgen declared their inability to reproduce the TMEM27 secretase turnover reported in 2011. Notwithstanding, Novartis and other companies have published patents on BACE2-specific inhibitors over several years and, paradoxically, verubecestat is more potent against BACE2 than BACE1 but was never tested for glucose-lowering. Equally puzzling is that one academic group is still publishing BACE2 inhibitors for T2D even post de-validation. One thing both targets have in common is the complete absence of genetic support from genome-wide disease association studies, but this warning sign went unheeded.
Conclusions
The massive waste of resources on the pursuit of BACE1 as an AD target over the last two decades is catastrophic. This tale of de-validation is compounded for this paralogous pair of enzymes by the fact that the original evidence for BACE2 as a T2D target was eventually refuted. The story of these targets highlights a range of crucial pharmacological pitfalls that must be avoided in the future.
Reference(s)
[1] Southan C, Hancock J.M. (2013) A tale of two drug targets: the evolutionary history of BACE1 and BACE2. Front Genet. 4:293.
In silico 360 Analysis for Drug Development - Chris Southan
Introduction:
Consequent to a memorandum of understanding between the Karolinska Institutet and the International Union of Basic and Clinical Pharmacology (IUPHAR) in 2018, a report on academic drug development, including guidelines (ADEV), has been drafted [1]. As part of this exercise, we conceived a triage for comprehensive informatics profiling around the compound, target, disease axis. We have termed this “in silico 360” (INS360), the aim of which was to support ADEV teams, since they may lack either internal expertise or external support to do this on their own. Indeed, some past SciLifeLab Drug Discovery and Development Platform projects had been halted because of overlooked competitive impingements or insufficient target validation evidence.
Methods
We assessed the current database landscape, mostly public but including commercial, for potential utility for INS360. We were guided primarily by content coverage, usability, and reputation. We also explored some open property prediction resources for assay interference and toxicological inferences.
Results:
As a first-stop-shop, we selected the IUPHAR/BPS Guide to PHARMACOLOGY with ~900 ligand-target relationships captured via expert curation of journal papers. Moving up in scale, we evaluated ChEMBL at 1.8 million compounds with 1.1 million assay descriptions and 7,000 targets. With yet another jump we could search the patent corpus with 18 million extracted compounds in SureChEMBL. We explored PubChem, which integrates these three with over 500 other sources linked to 96 million compounds, BioAssay results and connectivity into the NCBI Entrez system. The final jump in scale for document-to-chemistry navigation was represented by SciFinder with 155 million structures. On the target side, 360-exploration needs to encompass literature, structure, genetic variation, splicing, interactions, and disease pathways. From their UniProt links, both GtoPdb and ChEMBL provide these entry points. Navigating genetic association data in support of target validation was enabled by the OpenTargets portal and the GWAS Catalog. We also found servers that could produce prediction scores from chemical structures for a range of features important for de-risking development.
Conclusion:
This work scoped out initial resource choices for the INS360. We propose that not only ADEV operations but essentially any pharmacology research team has much to gain from this approach and many potential pitfalls can consequently be avoided when approaching key checkpoints, such as preparing a publication. However, support may be needed for both institutions and teams to get the best out of these complex and feature-rich databases.
[1] Southan C, (2019) Towards Academic Drug Development Guidelines, ChemRxiv pre-print no. 8869574
Will the correct BACE ORFs please stand up? - Chris Southan
BACE1 and BACE2 are protease targets for Alzheimer's and diabetes, respectively, but their validation is now questioned
Phylogenetic analysis can add functional insights
This came up against two key problems
A surprising prevalence of incorrect protein sequences predicted from genomes
Many BACE1 and BACE2 orthologues had truncation and/or indel errors.
Key phylogenetic representative genomes are languishing in an unfinished state
Some options for amelioration of these problems will be described
An update on the evolution of these enzymes will be shown
Look for new and potentially useful human 5HT2A-directed small molecule chemistry surfaced since the last meeting. Check for compounds against 5HT2A as primary target but also combined inhibitors. Poll round the key databases, literature and patents. Searching challenges arise from synonym soup, complex cross-reactivities (see PMID 29679900), in vitro data gaps and in vivo polypharmacology.
Quality and noise in big chemistry databases - Chris Southan
Presented at Aug 2019 ACS by Antony Williams. Abstract: The internet has changed the way we access chemistry data as well as providing access to data that can quickly proliferate and become referenceable. Web access to chemical structures and their integration with biological data has become massively enabling, with numbers for UniChem, PubChem and ChemSpider reaching 157, 97 and 71 million respectively (at the time of writing). A range of specialist databases small enough to be curated have stand-alone utility and synergies when integrated into the larger collections. These include DrugBank, BindingDB, ChEBI, and many others. Databases of any size have inherent quality challenges, but at large scale various forms of “noise” accumulate to problematic levels. The unfortunate consequence is that “bigger gets worse”. This is particularly associated with large uncurated submissions from vendors and automated document extractions (even though these are high-value). Virtual enumerations and circularity between overlapping sources add to the problem. As a result of some of the noise in the larger databases, the value becomes highly dependent on the specific applications. An example includes using the databases to support non-targeted analysis. This presentation covers examples of these noise and quality issues and suggests at least some options to ameliorate the problem.
Progress in drug discovery and chemical biology is hugely enabled by curated document-assay-result-compound-target relationships (D-A-R-C-P) in open databases from resources such as the Guide to Pharmacology and ChEMBL. These are synergistically integrated into PubChem which pre-computes chemical similarity and connectivity between over 95 million structures and 5.6 million BioAssay results. It also links chemistry to documents via various additional routes including MeSH and large scale submissions from publishers. However, these efforts are patchy and very few journals facilitate such connectivity. There thus remains a massive shortfall in public D-A-R-C-P capture from decades of papers and patents. This presentation will cover these aspects and discuss their partial amelioration by options such as author-driven depositions and open lab-book approaches as used by Open Source Malaria
Looking at chemistry - protein - papers connectivity in ELIXIR - Chris Southan
This is a poster for the UK ELIXIR meeting in Birmingham, UK, Nov 2018. It is the summary of a blog post, https://cdsouthan.blogspot.com/2018/08/an-initial-look-at-elixir-chemistry.html, that assesses chemistry <> protein <> papers connectivity (C-P-P) for five ELIXIR resources
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is a highly conserved process of post-transcriptional gene silencing in which double-stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) is reported in a wide range of eukaryotes, including worms, insects, mammals and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993 Rosalind Lee (Victor Ambros lab) was studying a non-coding gene in C. elegans, lin-4, that was involved in silencing another gene, lin-14, at the appropriate time in the development of the worm.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
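The base-pairing idea behind this deduction can be sketched in a few lines: a small RNA can pair with an mRNA wherever the mRNA contains the small RNA's reverse complement. The sequences below are short toy strings, not the real lin-4 or lin-14 sequences, and the function names are illustrative assumptions. Only perfect Watson-Crick pairing is checked, so the mismatches and G:U wobble pairs tolerated by real miRNA targeting are not modelled.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_perfect_sites(small_rna: str, mrna: str) -> list[int]:
    """Return 0-based start positions in `mrna` that pair perfectly with `small_rna`."""
    site = reverse_complement(small_rna)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)  # keep scanning, allowing overlaps
    return positions


# Toy sequences (hypothetical).
small_rna = "UCCCUGAGA"
utr = "AAGGCUCUCAGGGAUUU"

print(reverse_complement(small_rna))      # UCUCAGGGA
print(find_perfect_sites(small_rna, utr))  # [5]
```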
Types of RNAi (non-coding RNA)
miRNA
Length: 23-25 nt
Trans acting
Binds target mRNA with mismatches
Translation inhibition
siRNA
Length: 21 nt
Cis acting
Binds target mRNA with perfect complementarity
Piwi-RNA (piRNA)
Length: 25 to 36 nt
Expressed in germ cells
Regulates transposon activity
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
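The three steps above can be sketched as a toy simulation. This is a deliberately simplified illustration under stated assumptions: Dicer is modelled as cutting one strand into consecutive 21-nt pieces, RISC is assumed to keep exactly that strand as the guide, and targeting requires perfect complementarity. Overhangs, strand-selection rules and mismatch tolerance are all ignored, and every sequence and function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(strand: str, length: int = 21) -> list[str]:
    """Step 1 (Dicer): cut a long strand into consecutive `length`-nt pieces.

    Any leftover tail shorter than `length` is dropped.
    """
    return [strand[i:i + length] for i in range(0, len(strand) - length + 1, length)]


def is_silenced(mrna: str, guides: list[str]) -> bool:
    """Steps 2-3 (RISC): a retained guide strand pairs with a matching mRNA site."""
    return any(reverse_complement(guide) in mrna for guide in guides)


# Hypothetical 42-nt target region embedded in a toy mRNA.
target_region = "AUGGCUAGCUAGGAUCCGAUC" + "GAUUACGGCAUCGAUCGGAUC"
mrna = "GGGAAA" + target_region + "AAAUUU"

# The antisense strand of the trigger dsRNA is complementary to the target region.
antisense = reverse_complement(target_region)
guides = dice(antisense)

print(len(guides), [len(g) for g in guides])  # 2 [21, 21]
print(is_silenced(mrna, guides))              # True
print(is_silenced("AC" * 30, guides))         # False: unrelated mRNA is untouched
```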
THE RISC COMPLEX:
RISC is a large (>500 kDa) RNA-multiprotein complex that triggers mRNA degradation in response to siRNA.
Unwinding of double-stranded siRNA by an ATP-independent helicase.
The active components of RISC are Ago proteins (endonucleases), which cleave the target mRNA.
DICER: endonuclease (RNase Family III)
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN :
1. PAZ (PIWI/Argonaute/Zwille): recognition of target mRNA.
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of the mRNA (RNase H activity).
miRNA:
Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression.
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
Nutraceutical market, scope and growth: Herbal drug technology - Lokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market, which includes goods such as functional foods, drinks, and dietary supplements that provide health advantages beyond basic nutrition, is growing significantly. As healthcare expenses rise, the population ages, and people increasingly want natural and preventative health solutions, this industry is growing quickly. Further driving market expansion are product formulation innovations and the use of cutting-edge technology for customized nutrition. With its worldwide reach, the nutraceutical industry is expected to keep growing and to provide significant opportunities for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ... - Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters spanning 0.4−0.9 µm) and novel JWST images with 14 filters spanning 0.8−5 µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data at > 2.3 µm to construct an ultradeep image, reaching as deep as ≈ 31.4 AB mag in the stack and 30.3−31.0 AB mag (5σ, r = 0.1” circular aperture) in individual filters. We measure photometric redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts z = 11.5 − 15. These objects show compact half-light radii of R_1/2 ∼ 50 − 200 pc, stellar masses of M⋆ ∼ 10^7 − 10^8 M⊙, and star-formation rates of SFR ∼ 0.1 − 1 M⊙ yr^−1. Our search finds no candidates at 15 < z < 20, placing upper limits at these redshifts. We develop a forward modeling approach to infer the properties of the evolving luminosity function without binning in redshift or luminosity that marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results, and that the luminosity function normalization and UV luminosity density decline by a factor of ∼ 2.5 from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical models for evolution of the dark matter halo mass function.
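As a quick piece of arithmetic on the quoted decline, a factor of about 2.5 between z = 12 and z = 14 can be restated as a rate in dex per unit redshift. The sketch below assumes the decline is log-linear in redshift across that interval, which is an illustrative simplification and not a claim made in the abstract; the function name is hypothetical.

```python
import math


def decline_dex_per_unit_z(factor: float, z_low: float, z_high: float) -> float:
    """Average decline in log10 units per unit redshift, assuming log-linear evolution."""
    return math.log10(factor) / (z_high - z_low)


rate = decline_dex_per_unit_z(2.5, 12.0, 14.0)
print(f"{rate:.2f} dex per unit redshift")  # about 0.20
```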
Seminar on U.V. Spectroscopy by SAMIR PANDA
Spectroscopy is a branch of science dealing with the study of the interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflectance spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light absorbed by the analyte.
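The link between measured light and analyte concentration is usually expressed through the Beer-Lambert law, A = ε·l·c, with absorbance A = log10(I0/I). The summary above does not state this relation, so the sketch below is an added illustration; the intensities and the molar absorptivity value are made-up numbers and the function names are hypothetical.

```python
import math


def absorbance(incident: float, transmitted: float) -> float:
    """Absorbance A = log10(I0 / I) from incident and transmitted intensities."""
    return math.log10(incident / transmitted)


def concentration(a: float, molar_absorptivity: float, path_cm: float = 1.0) -> float:
    """Solve the Beer-Lambert law A = epsilon * l * c for c (mol/L)."""
    return a / (molar_absorptivity * path_cm)


# Made-up example: 10% of the light is transmitted through a 1 cm cuvette.
a = absorbance(incident=100.0, transmitted=10.0)
c = concentration(a, molar_absorptivity=5000.0)  # epsilon in L mol^-1 cm^-1 (assumed)

print(f"A = {a:.2f}, c = {c:.1e} mol/L")
```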
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini... - Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows ultra-fast, high-resolution imaging of cellular processes over time and space as they occur in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, response to treatments or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM Technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enable researchers to probe fast dynamic biological processes such as immune cell tracking, cell-cell interaction, vascularization and tumor metastasis with exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allowing for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancement of novel therapeutic strategies.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
What are greenhouse gases and how many gases are there that affect the Earth - moosaasad1975
What greenhouse gases are, how they affect the Earth and its environment, what the future of the environment and the Earth may be, and how weather and climate are affected.
Brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
This PDF is about schizophrenia.
1. 3. Divergence of protein identifiers
2. Methods
7. References
Will the real pharmacologically significant
proteins please stand up?
1. Introduction
Even in their more contemplative moments probably few pharmacologists cogitate on
“so how many human proteins actually exist?” Nevertheless, on a practical level their
engagement with names and identifiers (IDs) for pharmacological protein targets and
disease mechanistic components is intense and includes navigating between
databases and the literature. This work addresses three important aspects of protein
equivocality that pharmacologists may less aware of but that we encounter head-on
during curation of the IUPHAR/BPS Guide to PHARMACOLOGY [1 2]. These are:
1. Variability in canonical counts between 19,198 from the HUGO Gene Nomenclature
Committee (HGNC) up to 21,341 in GeneCards, indicating a surprising annotation
discordance for at least 10% of the human proteome
2. Uncertainty of alternatively spliced (AS) protein existence. While Ensembl predicts
over 100,000 AS mRNAs, the verification of these by proteomics is 30-fold less than
expected, inferring that the majority do not exist in vivo [3]
3. Evidence that some canonical Swiss-Prot (SP) entries are not the major isoform
Using UniProt we ascertained the 4-way intersect between SP protein IDs, HGNC Gene
Symbols, Ensembl genes and NCBI Gene IDs. The four sets were selected using
cross-reference queries from the UniProt interface. We then accessed our internal
protein statistics including the total human UniProt IDs that we had curated into GtoPdb
and those for which we had annotated data-supported and pharmacologically-relevant
ligand interactions. These were compared to the 4-way sequence set. We also counted
proteins for which UniProt had curated splice forms using the query “Alternative splicing
(KW-0025)”. We then and compared these with our ligand interaction set. We also
inspected one splice form that has been annotated in GtoPdb and checked the
information in SP. To address the isoform abundance question we queried the
Annotation of principal and alternative splice isoforms (APPRIS) database to check
targets [4].
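The 4-way intersect described above is a plain set operation once each cross-reference query has been exported as a list of Swiss-Prot accessions. The following is a minimal sketch under that assumption; the accessions are made-up placeholders, and the function and variable names are ours rather than anything provided by UniProt.

```python
# Toy accessions (hypothetical placeholders, not real UniProt IDs).
# Each set holds the Swiss-Prot accessions that carry the named cross-reference.
swissprot = {"ACC1", "ACC2", "ACC3", "ACC4", "ACC5"}
with_hgnc = {"ACC1", "ACC2", "ACC3", "ACC4"}
with_ensembl = {"ACC1", "ACC2", "ACC3"}
with_ncbi_gene = {"ACC1", "ACC2", "ACC3"}

def four_way_intersect(sp, hgnc, ensembl, ncbi):
    """Return (consensus, sp_only) for the four identifier sets."""
    consensus = sp & hgnc & ensembl & ncbi   # entries all four sources agree on
    sp_only = sp - (hgnc | ensembl | ncbi)   # entries with no cross-reference at all
    return consensus, sp_only

consensus, sp_only = four_way_intersect(
    swissprot, with_hgnc, with_ensembl, with_ncbi_gene
)
print(len(consensus), sorted(sp_only))
```

With real exports, the size of `consensus` corresponds to the central Venn segment and `sp_only` to the Swiss-Prot-only segment.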
1. Harding SD, et al. (2018) Nucl. Acids Res. 46 (Database Issue): D1091-D1106.
2. Southan C, et al. (2018) ACS Omega 3(7), PMID: 30087946.
3. Rodriguez JM, et al. (2018) Nucl. Acids Res. 46 (Database Issue): D213-D217.
4. Tress ML, et al. (2017) Trends Biochem Sci. 42(2): 98-110.
5. Southan C (2017) F1000Res. 7;6:448.
5. Protein alternative splicing
Christopher Southan, Simon D. Harding, Elena Faccenda, Adam J. Pawson and Jamie A. Davies.
IUPHAR/BPS Guide to PHARMACOLOGY, Centre for Discovery Brain Sciences, University of Edinburgh, UK
6. Discussion points
• In addition to AS touched on here, additional sources of protein equivocality and
heterogeneity include alternative initiations and post-translational modifications.
• The multiplexing of these from a canonical set of ~19,000 proteins (itself still
without a consensus) is predicted to run into the millions.
• The significance of this for pharmacology, systems biology and drug discovery is
acknowledged to be high but getting solid experimental data is difficult.
• GtoPdb users are welcome to alert us to potentially curatable papers on differential
ligand interactions related to any form of protein heterogeneity.
www.guidetopharmacology.org enquiries@guidetopharmacology.org @GuidetoPHARM
4. Comparing the consensus with GtoPdb
We especially thank all contributors, collaborators and NC-IUPHAR members
In the Venn diagram on the right the 4-way intersect shows that these four major
global pipelines concur for fewer than 19,000 protein-coding genes. Most divergent is
the 829 SP-only set. Inspection established that many of these are categorised as
pseudogenes by HGNC [5]. This surprising result includes some missing genomic
cross-mappings inside SP. However, the consensus is close to the HGNC count of
19,118 (note that Ensembl and NCBI reciprocally cross-map, hence the empty sections).
Our next step was to compare the 4-way set from the comparison above (blue) with
a) all the human proteins we have entered in GtoPdb (yellow) and b) those proteins
that have a curated interaction (mostly quantitative) against one or more of the
9405 ligands (green). The results were generally as expected in confirming that the
majority of our proteins are within the 4-way set (i.e. solidly supported). However,
the analysis was valuable in detecting minor anomalies (represented in the segments
of 5, 6 and 23). These are being followed up, but a major factor is that some of these
are missing GeneID cross-references in Swiss-Prot (i.e. are blue false negatives).
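The anomaly check described here amounts to asking which GtoPdb proteins fall outside the 4-way consensus. A minimal sketch follows, using made-up identifiers and a hypothetical helper name; it is not the procedure actually run for the poster.

```python
def outside_consensus(consensus, gtopdb_all, gtopdb_interactions):
    """Return the GtoPdb proteins that are absent from the 4-way consensus set."""
    return {
        "gtopdb_not_in_consensus": gtopdb_all - consensus,
        "interaction_not_in_consensus": gtopdb_interactions - consensus,
    }

# Toy identifiers (hypothetical), standing in for UniProt accessions.
consensus_set = {"A", "B", "C", "D"}
gtopdb_all = {"B", "C", "D", "E", "F"}
gtopdb_interactions = {"C", "D", "F"}  # proteins with a curated ligand interaction

anomalies = outside_consensus(consensus_set, gtopdb_all, gtopdb_interactions)
for label, ids in anomalies.items():
    print(label, sorted(ids))
```

Each protein flagged this way can then be inspected by hand, for example for a missing GeneID cross-reference.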
It is difficult to find papers with solid data showing AS affecting proteins for which we
have curated ligand interactions and which may thus exhibit differential pharmacology. Many
publications indicate that AS transcription a) is widespread, b) affects the majority of the
mammalian proteome and c) is likely to be functionally important in various biological
contexts (e.g. tumours and brain tissue) even if the mechanisms are unclear.
Notwithstanding, there are major uncertainties in proving the existence of AS proteins
since they are difficult to verify in vivo. We approached this question by counting our
interaction proteins with AS sequence variants annotated in Swiss-Prot.
The results of this are shown on the right. The yellow circle indicates that 52% of
human SP has at least one AS protein sequence annotated. This rises slightly to 54% in
our interaction set (blue). Importantly, AS in SP is target-class specific, rising to 70%
for kinases but only 14% for GPCRs (since many are single-exon genes). Note that
Ensembl predicts considerably more potential AS sequences than SP curates.
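Per-class percentages of this kind can be derived by grouping proteins by target class and counting those with at least one annotated AS sequence. The sketch below assumes a simple (accession, target class, has-AS flag) record layout of our own devising, with toy data that does not reproduce the poster's figures.

```python
from collections import defaultdict

def as_fraction_by_class(records):
    """Return, per target class, the fraction of proteins with an annotated AS form."""
    totals = defaultdict(int)
    with_as = defaultdict(int)
    for _accession, target_class, has_as in records:
        totals[target_class] += 1
        if has_as:
            with_as[target_class] += 1
    return {cls: with_as[cls] / totals[cls] for cls in totals}

# Toy records (hypothetical): accession, target class, AS annotation present?
records = [
    ("K1", "kinase", True), ("K2", "kinase", True), ("K3", "kinase", False),
    ("G1", "GPCR", False), ("G2", "GPCR", True),
    ("G3", "GPCR", False), ("G4", "GPCR", False),
]

fractions = as_fraction_by_class(records)
for cls, frac in fractions.items():
    print(f"{cls}: {frac:.0%}")
```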
In GtoPdb we only assign quantitative and differentially-specific AS-ligand interactions
if the papers meet our curatorial stringency. We also need evidence that data-
supported differential binding has pharmacological significance. This is challenging for
many reasons that cannot be expanded on here (but we would be pleased to discuss).
Consequently, we have only one AS entry: the interaction between protein target
2903 (claudin18) and antibody ligand 9209 (below, together with the AS first exon).
The specific case of claudin18 and extrapolation to other AS proteins in GtoPdb
raises the question as to which sequence may be quantitatively dominant (i.e. the
principal isoform in vivo). However, there are inherent challenges in quantifying
AS-specific peptides by mass-spec proteomics or estimating surrogate relative
abundances from transcription data. We thus chose the APPRIS database, which
uses a range of computational methods and fold coverage scores to select the most
likely principal isoform. In this case the two SP isoforms scored equally.