The document discusses the content of ligands from the IUPHAR/BPS Guide to PHARMACOLOGY database (GtoPdb) that is contained within PubChem. It finds that GtoPdb ligands have extensive overlap with several other sources within PubChem, including patents, DrugBank, vendor structures, bioassays, and ChEMBL. This overlap allows users to find additional information on GtoPdb ligands from these complementary sources within PubChem.
GtoPdb: A resource for cell-based perturbogens (Chris Southan)
Poster for ELRIG, Mölndal, 11/12 May 2017.
This poster will also be presented at BioITWorld, Boston, May 23-25
A resource for the selection and interpretation of cell-based perturbogens: the IUPHAR/BPS Guide to PHARMACOLOGY
Christopher Southan, Elena Faccenda, Joanna L. Sharman, Adam J. Pawson, Simon D. Harding, Jamie A. Davies
Translational research requires the integration of the in vitro molecular mechanisms of action (mmoa) of small molecules, cell-based screening studies, animal models and eventual clinical trials. The International Union of Basic and Clinical Pharmacology (IUPHAR)/British Pharmacological Society (BPS) database, GtoPdb (http://www.guidetopharmacology.org/), provides expert-annotated molecular interactions between endogenous receptor ligands, probes, lead compounds, clinical drugs and their protein targets. It thus provides a core set of quantitative pharmacological relationships that can be interrogated for many purposes, including by those running cell-based screens, not only during result interpretation but also to identify key compounds for scoping and consolidation experiments. As described in [1], GtoPdb is populated by records extracted from pharmacology and medicinal chemistry journals and is released quarterly. Quality is ensured by curatorial stringency and our unique model of content selection, based on recommendations from IUPHAR target class subcommittees of international experts collaborating with the in-house curators. The database now has over 14 000 binding values (mainly IC50, Ki or Kd) between 8000 ligands and 1500 human proteins (mainly primary but also secondary off-target interactions), representing a druggable proteome of ~7%. Our coverage is complementary to other sources. For example, of the 6565 structures we recently submitted to PubChem as CIDs, 5206 were not in DrugBank and 1535 were not in ChEMBL. These include recommended tool compounds with relatively defined mmoa (including 110 from the Structural Genomics Consortium Probe Portal). We also have 75% overlap with vendors, which helps procurement, and 80% with patent extractions, which in many cases allows mapping to SAR data sets from first filings (some of which we point to). In a cell screening context, 1254 of our targets intersect with proteins in the Reactome pathway database.
This is one way to select chemical perturbation points that could be detected by assay readouts. Since November 2015 we have been funded by the Wellcome Trust to extend into immunopharmacology (within the existing database schema), which is now driving overall GtoPdb content expansion. Parties engaged in cell-based assays who use, or could use, compounds we hold are encouraged to use GtoPdb and to contact us with queries, possible analogue expansions and/or alerts to prospective new content. [1] Southan C et al. (2016) Nucleic Acids Res. 44(D1):D1054-68, PMID: 26464438
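The complementarity figures quoted in this abstract (structures absent from DrugBank or ChEMBL, percentage overlap with vendors and patents) are the kind of numbers that plain set arithmetic over CID lists produces. The following is a minimal sketch, assuming each source has already been exported from PubChem as a set of integer CIDs; the function name `overlap_report` and the toy CID values are invented for illustration and are not GtoPdb data or the authors' method.

```python
from __future__ import annotations


def overlap_report(own: set[int], other: set[int]) -> dict[str, float]:
    """Summarise how one set of PubChem CIDs overlaps with another."""
    shared = own & other   # CIDs present in both sources
    unique = own - other   # CIDs found only in our own source
    percent = 100.0 * len(shared) / len(own) if own else 0.0
    return {
        "shared": len(shared),
        "unique_to_own": len(unique),
        "percent_shared": percent,
    }


# Toy CID sets, invented for illustration only.
gtopdb_cids = {1, 2, 3, 4, 5, 6, 7, 8}
drugbank_cids = {1, 2, 9}
vendor_cids = {1, 2, 3, 4, 5, 6, 10}

print(overlap_report(gtopdb_cids, drugbank_cids))
print(overlap_report(gtopdb_cids, vendor_cids))
```

Reporting both the count unique to one source and the percentage shared mirrors the two ways the abstract states its figures: as absolute counts for the curated databases and as percentages for vendors and patents.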
Sorting bioactive wheat from database chaff (Chris Southan)
Abstract
Databases of bioactive compounds are crucial for pharmacology, drug discovery and chemical genomics as public sources approach ~100 million records. However, in recent years this shift from famine to feast has made it harder to search chemical structures and linked activity data, particularly for those unfamiliar with the constitutive challenges of molecular representation in silico (PMID 25415348). A key problem is the presence of multiple entries that are structural variants of “the same thing” as pharmacological entities (i.e. representational multiplexing). For example, a 2009 comparison of three database subsets of ~1200 approved drugs recorded only 807 structures in common (PMID 20298516). In addition, published counts of approved drugs vary widely. These issues have been continually encountered by the Guide to PHARMACOLOGY database (GtoPdb) team which, since 2009, has curated ~5500 small molecules (including approved drugs) from papers. Concomitantly, we have noticed an increase in multiplexing as PubChem pushes towards 65 million compound identifiers (CIDs). Since one of our key objectives is to affinity-map ligands to their targets, we decided to assess this multiplexing problem in order to optimise our curation rules. The results have implications for the entire bioactivity information space. We began by compiling CID sets for seven different sources within PubChem encompassing approved drugs. Initially, a 7x7 pairwise comparison matrix indicated low overlap between these sources. A Venn diagram was then made from the approved drug CIDs mapped by DrugBank, Therapeutic Target Database and ChEMBL. At 749, the three-way intersect was less than 35% of the union of all CIDs covered by the sets. Strikingly, this looks worse than the 2009 study (although the sources and comparison methods were different). We will present further analyses that go some way towards explaining these results.
One of these is determining “same connectivity” statistics inside PubChem as a measure of multiplexing. For DrugBank, each approved drug was related to, on average, 19 different CIDs as structural variants. Analysis of multiplexing confirmed trends we had observed during individual drug curation. This included ~30% stereoisomer enumerations but, surprisingly, ~70% isotopic derivatives, dominated by patent-derived virtual deuteration. We also established that the ratio of submissions (SIDs) to CIDs was 78. The paradox was that, despite this high “majority vote” support for approved drug CIDs curated by DrugBank, only 55% were in the 3-way consensus (figures for the other two curated sources were similar). Analysing by year in PubChem indicated how the recent expansion of vendor and patent-extraction structures contributes to both multiplexing and the SID:CID ratio. While approved drugs are strongly impacted, associated problems, such as split activity data and deciding the “correct” structures, affect essentially all public drug discovery chem
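The two measurements described in this abstract, the three-way consensus as a fraction of the union and the number of CIDs that share one connectivity, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' procedure: it approximates "same connectivity" by grouping CIDs on the first block of the InChIKey (the 14-character connectivity hash), whereas the abstract refers to PubChem's own same-connectivity relationships. The function names, CIDs and InChIKey strings below are placeholders invented for the example.

```python
from collections import defaultdict


def three_way_consensus(a, b, c):
    """Return (intersect size, union size, intersect/union) for three CID sets."""
    intersect = a & b & c
    union = a | b | c
    ratio = len(intersect) / len(union) if union else 0.0
    return len(intersect), len(union), ratio


def connectivity_groups(cid_to_inchikey):
    """Group CIDs whose InChIKeys share the same first (connectivity) block."""
    groups = defaultdict(set)
    for cid, inchikey in cid_to_inchikey.items():
        skeleton = inchikey.split("-")[0]  # first block of the InChIKey
        groups[skeleton].add(cid)
    return dict(groups)


def mean_cids_per_group(groups):
    """Average number of CIDs per connectivity group (a multiplexing measure)."""
    return sum(len(cids) for cids in groups.values()) / len(groups) if groups else 0.0


# Placeholder data: these CIDs and InChIKeys are made up, not real records.
source_a = {1, 2, 3, 4}
source_b = {2, 3, 4, 5}
source_c = {3, 4, 6}
toy_keys = {
    101: "AAAAAAAAAAAAAA-UHFFFAOYSA-N",  # flat (no stereo) form
    102: "AAAAAAAAAAAAAA-BBBBBBBBSA-N",  # same skeleton, different second block
    103: "CCCCCCCCCCCCCC-UHFFFAOYSA-N",  # a different skeleton
}

print(three_way_consensus(source_a, source_b, source_c))
print(mean_cids_per_group(connectivity_groups(toy_keys)))
```

Grouping on the first InChIKey block collapses stereoisomers and isotopic variants onto one skeleton, which is why it serves as a rough stand-in for the multiplexing described here.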
Sorting bioactive wheat from database chaff: Challenges of discerning correct... (Guide to PHARMACOLOGY)
Since 2009 the Guide to PHARMACOLOGY database (GtoPdb) team has curated 7586 ligands from papers, including approved drugs, clinical candidates, research compounds, peptides and clinical antibodies (PMID 24234439). As PubChem pushes towards 70 million compound identifiers (CIDs), we have noticed the problem of “multiplexing”. During the curation of 5713 small molecules as CIDs, we encountered many representations (i.e. different CIDs) of the same pharmacological entities. Three types of variation dominate: stereochemistry, mixtures and isotopic analogues. These are known constitutive issues for chemical databases, but in recent years we observed that this multiplexing was reaching problematic proportions (i.e. more chaff), especially for clinically used drugs (i.e. proportionally less wheat).
Searching for chemical information using PubChem (Sunghwan Kim)
Presented at the 257th American Chemical Society (ACS) National Meeting in Orlando, FL (April 1, 2019). [CHED 303]
==== Abstract ====
PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public chemical database that provides information on a broad range of chemical entities, including small molecules, lipids, carbohydrates, and (chemically modified) amino acid and nucleic acid sequences (including siRNA and miRNA). With three million unique users per month at peak, PubChem is ranked as one of the most visited chemistry websites in the world. A substantial number of PubChem users are aged 18 to 24 and are likely to be undergraduate or graduate students at academic institutions. PubChem therefore has great potential as an online resource for chemical education. In this talk, we will present “PubChem Search”, a new web interface that allows users to quickly find desired chemical information. This interface supports chemical name search as well as various types of chemical structure search, including identity/similarity search, superstructure/substructure search, and molecular formula search. Using PubChem Search, it is also possible to search for journal articles or patent documents that mention a given chemical. The hits returned from a search can be downloaded to local machines or further refined or analyzed in conjunction with other PubChem tools and services. In this presentation, we will demonstrate how the PubChem Search interface can be used to search beyond Google for chemical information of interest.
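The abstract describes the PubChem Search web interface; comparable name and identity searches can also be scripted through PubChem's separate PUG REST service. The sketch below only builds request URLs and makes no network call. The helper names are hypothetical, the example inputs are illustrative, and the URL patterns should be checked against the current PUG REST documentation before use.

```python
from urllib.parse import quote

# Base address of the PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def name_to_cids_url(name: str) -> str:
    """URL that asks PUG REST for the CIDs matching a chemical name."""
    return f"{PUG_REST}/compound/name/{quote(name, safe='')}/cids/JSON"


def same_connectivity_url(cid: int) -> str:
    """URL that asks for CIDs sharing atom connectivity with the given CID."""
    return (
        f"{PUG_REST}/compound/fastidentity/cid/{cid}/cids/JSON"
        "?identity_type=same_connectivity"
    )


# Hypothetical helper calls with illustrative inputs.
print(name_to_cids_url("acetylsalicylic acid"))
print(same_connectivity_url(2244))
```

Keeping URL construction separate from fetching makes the request easy to inspect or log before any HTTP client is involved.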
The internet now offers access to a myriad of online resources that can be of value to chemists working in the Life Sciences. While finding information online is, in many cases, a simple search away, the accuracy and validity of the associated data and information should be questioned. As more databases and resources are introduced online, commonly without integration with other resources, a scientist must perform multiple searches and then undertake the task of meshing and merging data. ChemSpider is a freely accessible online database that has taken on the challenge of meshing together distributed resources across the internet to provide a structure-based hub. It is a crowdsourcing environment hosting over 26 million unique compounds linked out to over 400 data sources. With well-defined programming interfaces for integration, ChemSpider has been integrated with many commercial and open software packages and is presently serving as the chemistry foundation for the IMI Open PHACTS project.
This is a presentation given at the Opal Events meeting "Drug Discovery Partnerships: Filling the Pipeline". I was speaking in a session with Jean-Claude Bradley regarding "Pre-competitive Collaboration: Sharing Data to Increase Predictability". This presentation discussed some of the work we are doing on Open PHACTS. My thanks especially to Carole Goble, Lee Harland and Sean Ekins for their comments.
Poster titled "The imperative of small, high quality data for underpinning big data: the IUPHAR/BPS Guide to PHARMACOLOGY". Presented by Dr. Christopher Southan at the British Pharmacological Society, Institute for Translational Medicine & Therapeutics (ITMAT) Meeting, Edinburgh, March 2017, ‘Big Data & the Development of New Medicines’.
Slicing and dicing curated protein targets: Analysing the drugged, druggable ... (Guide to PHARMACOLOGY)
Presented by team member Chris Southan in April 2015 at the BPS Focused meeting in Edinburgh: Exploiting the new pharmacology and application to drug discovery.
The IUPHAR/BPS Guide to PHARMACOLOGY in 2018: new features and updates (Guide to PHARMACOLOGY)
2018 update poster for the IUPHAR/BPS Guide to PHARMACOLOGY, giving details of new features and updates. To be presented at Pharmacology Futures, Edinburgh, May 2018; ELIXIR All Hands, Berlin, June 2018; and the World Congress of Pharmacology, Kyoto, Japan, July 2018.
Metabolite Set Enrichment Analysis (ChemRICH) (Dinesh Barupal)
Metabolomics answers a fundamental question in biology: How does metabolism respond to genetic, environmental or phenotypic perturbations? Combining several metabolomics assays can yield datasets for more than 800 structurally identified metabolites. However, biological interpretations of metabolic regulation in these datasets are hindered by inherent limits of pathway enrichment statistics. We have developed ChemRICH, a statistical enrichment approach that is based on chemical similarity rather than sparse biochemical knowledge annotations. ChemRICH utilizes structure similarity and chemical ontologies to map all known metabolites and name metabolic modules. Unlike pathway mapping, this strategy yields study-specific, non-overlapping sets of all identified metabolites. The subsequent enrichment statistics are superior to pathway enrichments because ChemRICH sets have a self-contained size, so p-values do not rely on the size of a background database. We demonstrate ChemRICH’s efficiency on a public metabolomics data set discerning the development of type 1 diabetes in a non-obese diabetic mouse model. ChemRICH is available at www.chemrich.fiehnlab.ucdavis.edu
The iCSS CompTox Chemistry Dashboard is a publicly accessible dashboard provided by the National Center for Computational Toxicology at the US-EPA. It serves a number of purposes, including providing a chemistry database underpinning many of our public-facing projects (e.g. ToxCast and ExpoCast). The available data and searches provide a valuable path to structure identification using mass spectrometry as the source data. With an underlying database of over 720,000 chemicals, the dashboard has already been used to assist in identifying chemicals present in house dust. This poster reviews the benefits of the EPA’s platform and the underlying algorithms used for compound identification from high-resolution mass spectrometry data. Standard approaches for both mass and formula lookup are available, but the dashboard also delivers a novel approach for hit ranking based on functional use of the chemicals. The focus on high-quality data, novel ranking approaches and integration with other resources of value to mass spectrometrists makes the CompTox Dashboard a valuable resource for the identification of environmental chemicals. This abstract does not reflect U.S. EPA policy.
A presentation covering the main phases of the war, the decisive battles and the major agreements among the victors, as well as the Jewish and Roma Holocaust, Japanese repression in Asia, and the Allied reprisals that culminated in the dropping of two atomic bombs on Japan.
Poster presented at the ELIXIR All-Hands Meeting in Lisbon, June 2019. It gives a broad summary of Guide to Pharmacology activities in the last year, emphasising new tools and our extension into malaria pharmacology.
A general poster about the IUPHAR/BPS Guide to PHARMACOLOGY, updated for 2017. It works well as a handout or pinned on departmental noticeboards.
Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at ... (Chris Southan)
Talk given at the NC-IUPHAR meeting, Paris, October 2013.
ChEMBL, DrugBank, Human Metabolome Database and the Therapeutic Target Database are resources of curated chemistry-to-protein relationships widely used in the chemogenomic arena. In this work we have extended an earlier analysis (PMID 22821596) by comparing chemistry and protein target content between 2010 and 2013. For the former, details are presented for overlaps and differences, statistics of stereochemistry as well as stereo representation and MW profiles between the four databases. For 2013 our results indicate quality improvements, major expansion, increased achiral structures and changes in MW distributions. An orthogonal comparison of chemical content with different sources inside PubChem highlights further interpretable differences. Expansion of protein content by UniProt IDs is also recorded for 2013 and Gene Ontology comparisons for human-only sets indicate differences. These emphasise the expanding complementarity of chemistry-to-protein relationships between sources, although different criteria are used for their capture.
These slides will be presented at the Pharmacology 2017 meeting in London during the following session:
Abstract Number: OB073
Abstract Title: Capturing new BIA 10-2474 molecular data in the IUPHAR/BPS Guide to PHARMACOLOGY
Date: Wednesday, December 13, 2017, 11:30 AM
Oral Session: Oral Communications: Mixed Tracks
Presented to David Gloriam's Group, Copenhagen, Feb 2020
The theme will be presented from the perspective of both past involvement in peptide curation in the Guide to Pharmacology (GtoPdb) and current searching for bioactive peptides in the wider ecosystem that includes ChEMBL and PubChem. The core problem is that peptides hang in limbo between bioinformatics (BLAST) and cheminformatics (Tanimoto), neither of which provides optimal searching. Curating peptides in GtoPdb presents many challenges, including mapping endogenous peptides to Swiss-Prot cleavage annotations. For synthetic peptides, equivocal specification of modifications and exact positions of radiolabels are also problematic. However, target-mapped, citation-supported quantitative binding parameters are curated where possible. For those peptides falling below the PubChem CID SMILES limit of approximately 70 residues, GtoPdb has been using Sugar and Splice from NextMove Software to convert them into CIDs. Specific problems associated with finding bioactive peptides in databases will be outlined.
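For readers less familiar with the cheminformatics side of this comparison, the Tanimoto coefficient is the shared fraction of two feature sets, |A ∩ B| / |A ∪ B|. The sketch below assumes each structure has already been reduced to a set of "on" fingerprint bit positions; the fingerprints are toy values invented for the example, and no particular fingerprinting toolkit is implied.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


# Toy fingerprints: integers stand in for hashed substructure features.
peptide_a = {1, 2, 3}
peptide_b = {2, 3, 4}

print(tanimoto(peptide_a, peptide_b))
```

Because larger peptides share most of their backbone features, set-based scores of this kind tend to be high between many pairs, which is one reason the abstract regards neither sequence nor fingerprint searching as optimal for peptides.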
Vicissitudes of target validation for BACE1 and BACE2 (Chris Southan)
Introduction/Background & Aims
The beta-site amyloid precursor protein (APP) cleaving enzyme 1 (BACE1) was implicated as a drug target for Alzheimer's Disease (AD) back in 1999. In 2011, the paralogue, BACE2, became a new proposed target for type 2 diabetes (T2D), having been reported to be the TMEM27 secretase regulating pancreatic beta-cell function [1]. By 2019 the accumulated evidence, including a swathe of failed clinical trials for BACE1 inhibitors, had produced a de facto de-validation of both targets in both diseases. As a learning exercise, the series of events leading up to this is reviewed here.
Method/Summary of work
Basic information about these two targets and the lead compounds against them was sourced via the IUPHAR/BPS Guide to Pharmacology (GtoPdb) as target IDs 2330 and 2331, for BACE1 and BACE2, respectively. This was consolidated by a literature and patent review as well as by following the targets in other databases. The most recent information on clinical trials was sourced from press releases.
Results/Discussion
GtoPdb annotates 24 lead compounds against BACE1 and 12 against BACE2. The corresponding counts mapped to these targets in ChEMBL are 8741 and 1377, making BACE1 one of the most actively pursued enzyme targets ever. Notwithstanding the massive global effort, during 2018 Merck’s verubecestat and J&J’s atabecestat BACE1 inhibitors not only failed their Phase III endpoints but even appeared to worsen cognition in prodromal patients. In 2019 Amgen/Novartis stopped Phase II/III trials of umibecestat, which also showed more cognitive decline in the treatment group compared to controls. BACE2 presented an anomalous situation in several ways. By 2016 both Novartis and Amgen had declared their inability to reproduce the TMEM27 secretase turnover reported in 2011. Notwithstanding, Novartis and other companies have published patents on BACE2-specific inhibitors over several years and, paradoxically, verubecestat is more potent against BACE2 than BACE1 but was never tested for glucose-lowering. Equally puzzling is that one academic group is still publishing BACE2 inhibitors for T2D even post de-validation. One thing both targets have in common is the complete absence of genetic support from genome-wide disease association studies, but this warning sign went unheeded.
Conclusions
The massive waste of resources on the pursuit of BACE1 as an AD target over the last two decades is catastrophic. This tale of de-validation is compounded for this paralogous pair of enzymes by the fact that the original evidence for BACE2 as a T2D target was eventually refuted. The story of these targets highlights a range of crucial pharmacological pitfalls that must be avoided in the future.
Reference(s)
[1] Southan C, Hancock J.M. (2013) A tale of two drug targets: the evolutionary history of BACE1 and BACE2. Front Genet. 4:293.
In silico 360 Analysis for Drug Development
Chris Southan
Introduction:
Consequent to a memorandum of understanding between the Karolinska Institutet and the International Union of Basic and Clinical Pharmacology (IUPHAR) in 2018, a report on academic drug development (ADEV), including guidelines, has been drafted [1]. As part of this exercise, we conceived a triage for comprehensive informatics profiling around the compound, target, disease axis. We have termed this “in silico 360” (INS360), the aim of which is to support ADEV teams, since they may lack either the internal expertise or the external support to do this on their own. Indeed, some past SciLifeLab Drug Discovery and Development Platform projects had been halted because of overlooked competitive impingements or insufficient target validation evidence.
Methods
We assessed the current database landscape, mostly public but including commercial, for potential utility for INS360. We were guided primarily by content coverage, usability, and reputation. We also explored some open property prediction resources for assay interference and toxicological inferences.
Results:
As a first-stop-shop, we selected the IUPHAR/BPS Guide to PHARMACOLOGY with ~900 ligand-target relationships captured via expert curation of journal papers. Moving up in scale, we evaluated ChEMBL at 1.8 million compounds with 1.1 million assay descriptions and 7,000 targets. With yet another jump we could search the patent corpus with 18 million extracted compounds in SureChEMBL. We explored PubChem, which integrates these three with over 500 other sources linked to 96 million compounds, BioAssay results and connectivity into the NCBI Entrez system. The final jump in scale for document-to-chemistry navigation was represented by SciFinder with 155 million structures. On the target side, 360-exploration needs to encompass literature, structure, genetic variation, splicing, interactions, and disease pathways. From their UniProt links, both GtoPdb and ChEMBL provide these entry points. Navigating genetic association data in support of target validation was enabled by the OpenTargets portal and the GWAS Catalog. We also found servers that could produce prediction scores from chemical structures for a range of features important for de-risking development.
Conclusion:
This work scoped out initial resource choices for the INS360. We propose that not only ADEV operations but essentially any pharmacology research team has much to gain from this approach and many potential pitfalls can consequently be avoided when approaching key checkpoints, such as preparing a publication. However, support may be needed for both institutions and teams to get the best out of these complex and feature-rich databases.
[1] Southan C, (2019) Towards Academic Drug Development Guidelines, ChemRxiv pre-print no. 8869574
Will the correct BACE ORFs please stand up?
Chris Southan
BACE1 and BACE2 are protease targets for Alzheimer's and diabetes, respectively, but their validation is now questioned
Phylogenetic analysis can add functional insights
This came up against two key problems
A surprising prevalence of incorrect protein sequences predicted from genomes
Many BACE1 and BACE2 orthologues had truncation and/or indel errors.
Key phylogenetic representative genomes are languishing in an unfinished state
Some options for amelioration of these problems will be described
An update on the evolution of these enzymes will be shown
The aim is to look for new and potentially useful human 5HT2A-directed small-molecule chemistry that has surfaced since the last meeting. This involves checking for compounds with 5HT2A as the primary target as well as combined inhibitors, and polling the key databases, literature and patents. Searching challenges arise from synonym soup, complex cross-reactivities (see PMID 29679900), in vitro data gaps and in vivo polypharmacology.
Quality and noise in big chemistry databases
Chris Southan
Presented at Aug 2019 ACS by Antony Williams. Abstract: The internet has changed the way we access chemistry data, as well as providing access to data that can quickly proliferate and become referenceable. Web access to chemical structures and their integration with biological data has become massively enabling, with numbers for UniChem, PubChem and ChemSpider reaching 157, 97 and 71 million respectively (at the time of writing). A range of specialist databases small enough to be curated have stand-alone utility and synergies when integrated into the larger collections. These include DrugBank, BindingDB, ChEBI, and many others. Databases of any size have inherent quality challenges but at large scale various forms of “noise” accumulate to problematic levels. The unfortunate consequence is that “bigger gets worse”. This is particularly associated with large uncurated submissions from vendors and automated document extractions (even though these are high-value). Virtual enumerations and circularity between overlapping sources add to the problem. As a result of some of the noise in the larger databases, their value becomes highly dependent on the specific application. An example is using the databases to support non-targeted analysis. This presentation covers examples of these noise and quality issues and suggests at least some options to ameliorate the problem.
Progress in drug discovery and chemical biology is hugely enabled by curated document-assay-result-compound-target relationships (D-A-R-C-P) in open databases from resources such as the Guide to Pharmacology and ChEMBL. These are synergistically integrated into PubChem, which pre-computes chemical similarity and connectivity between over 95 million structures and 5.6 million BioAssay results. It also links chemistry to documents via various additional routes including MeSH and large-scale submissions from publishers. However, these efforts are patchy and very few journals facilitate such connectivity. There thus remains a massive shortfall in public D-A-R-C-P capture from decades of papers and patents. This presentation will cover these aspects and discuss their partial amelioration by options such as author-driven depositions and open lab-book approaches as used by Open Source Malaria.
Looking at chemistry - protein - papers connectivity in ELIXIR
Chris Southan
This is a poster for the UK ELIXIR meeting in Birmingham UK, Nov 2018. It is the summary of a blog-post https://cdsouthan.blogspot.com/2018/08/an-initial-look-at-elixir-chemistry.html that assesses chemistry <> protein <> papers connectivity (C-P-P) for five ELIXIR resources.
Poster for World Congress of Pharmacology 2018, Kyoto
Introduction: The pharmacological literature and patents connect compound structures to their bioactivity. However, entombing these relationships for millions of compounds among millions of PDFs is acknowledged as massively problematic. The situation is ameliorated by resources that extract the entity and data relationships the authors and inventors put “in” to their PDFs back “out” into structured database records. The IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb) has been doing this by stringent curation of ligands and their quantitative activity against protein targets [1]. Our citations are submitted to PubChem (PC), who then link to PubMed (PM) [2]. This study presents an overview of this connectivity.
Methods: For GtoPdb entries in PC Substance we used the PC interface to count our submitted PM links. This gives the PC > PM mapping counts from which we analysed the PM links. We then performed reciprocal analyses (i.e. PM > PC) by selecting PM sets. We then compared two journals by counting structure links by year and source.
Results: From 8988 GtoPdb-submitted ligand substances in PC (release 2017.5), 7309 are linked to 8980 PM entries. Of the 7309 there are 5632 links to chemical structures in PC the rest being antibodies and larger peptides. From the 8980 PMIDs, the Journal of Medicinal Chemistry (JMC) accounted for 1003 as our most frequently cited primary source of structure-to-activity mappings. For the British Journal of Pharmacology (BJP) most of the 345 cross-references were development compounds. Further analysis showed that from 2014 to 2017 the BJP to PC links of ~ 30 structures per year are mostly from GtoPdb and the Comparative Toxicology Database. However, going back to 2010-12, this increased to 500-800 connections, mainly derived from the IBM automated chemical extraction from abstracts. A similar pattern was observed for JMC.
Conclusion: Navigation between documents and databases is an essential competence for pharmacologists and drug discovery but the NCBI Entrez system is daunting. GtoPdb is a major contributor of high-quality links and provides a first-stop to guide users into the PC/PM systems. However, our results indicated potentially serious specificity issues with automated chemistry-to-journal linking from non-GtoPdb sources.
References: [1] Harding et al. (2018). Nucl. Acids Res. 46 (Database Issue), doi: 10.1093/nar/gkx1121.
Seminar of U.V. Spectroscopy by SAMIR PANDA
Spectroscopy is a branch of science dealing with the study of the interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflectance spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light absorbed by the analyte.
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques such as CRISPR-Cas9 are being explored for editing mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Richard's adventures in two entangled wonderlands
Richard Gill
Since the loophole-free Bell experiments of 2015 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes on Io’s surface have been monitored from both spacecraft and ground-based telescopes. Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images show that a plume deposit from a powerful eruption at Pillan Patera has covered part of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive optics at visible wavelengths.
Assessing GtoPdb ligand content in PubChem
1. Christopher Southan, Elena Faccenda, Simon J. Harding, Joanna L. Sharman, Adam J. Pawson, and Jamie A Davies,
Centre for Integrative Physiology, The University of Edinburgh, EH8 9XD UK,
www.guidetopharmacology.org http://www.slideshare.net/cdsouthan/assessing-gtopdb-ligand-content-in-pubchem
Assessing the IUPHAR/BPS Guide to PHARMACOLOGY ligand content in PubChem
INTRODUCTION
The utilities of these intersects are outlined below (in order of counts):
• CNER refers to “Chemical Named Entity Recognition” for the automated extraction of chemistry from patents by sources submitting to PubChem (of
which SureChEMBL is the largest at 16.3 million). This means that users can track-back most of our ligands to early patent filings that can often include
more SAR than eventually appeared in the papers.
• Our low overlap with DrugBank indicates that the two sources are complementary in bioactive compound selection (i.e. the OR union is 12605).
• The possibility of sourcing purchasable compounds is important for experimental pharmacologists. From the 64 million vendor structures in PubChem
we have nearly an 80% overlap and similarity searches may pick up analogues where there is no exact match.
• The “BioAssay active” tag overlaps extensively with ChEMBL entries but users can check for a range of activities for a ligand that may be additional to
the values we have extracted from selected papers.
• The MeSH term “pharmacological action” is useful but our impression is that NLM is falling behind in the PubChem indexing of this term.
• PDB ligand structures are valued database cross-references for many reasons.
• We have introduced a new feature that allows users to retrieve just our 1291 approved drug SID entries (Query “approved[Comment] AND
"IUPHAR/BPS Guide to PHARMACOLOGY"[SourceName]”). The “PubChem Same Compound” select then generates 1174 small-molecule CIDs. This
facilitates different types of comparative analysis between drug lists.
• As expected, our overlap with ChEMBL structures is high but we have captured 1147 structures not in this source, mainly due to different journal capture
and shorter release cycles.
• The selection “unique to GtoPdb” indicates those CIDs where we are the only source in the whole of PubChem. These are predominantly novel
structures we have extracted from papers but in some cases we have selected a different structure from other sources.
• There may be interest in which pharmacologically active peptides we have CIDs for. A simple molecular weight (Mw) cut-off isolates 178 entries.
Further details related to the intersects above are given in this GtoPdb blog post https://blog.guidetopharmacology.org/2016/10/31/gtopdb-ligands-in-pubchem/.
This post about PubChem sources in general may also be of interest https://cdsouthan.blogspot.se/2016/06/pubchem-source-of-month.html.
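The Entrez queries quoted in the bullets above can, in principle, also be issued programmatically. The following is a minimal sketch that only builds the request URLs for the NCBI E-utilities `esearch` endpoint; it does not contact the server. The query strings are copied from the poster, but the helper name `build_esearch_url`, the choice of JSON output, and the assumption that E-utilities accepts the same field tags as the PubChem web interface are illustrative and should be checked against current NCBI documentation.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query strings as quoted on the poster.
SOURCE_QUERY = '"IUPHAR/BPS Guide to PHARMACOLOGY"[SourceName]'
APPROVED_QUERY = "approved[Comment] AND " + SOURCE_QUERY


def build_esearch_url(db: str, term: str) -> str:
    """Return an esearch URL for the given Entrez database and query term.

    Hypothetical helper: 'pcsubstance' and 'pccompound' are the Entrez
    database names assumed here for PubChem Substance and Compound.
    """
    params = {"db": db, "term": term, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# URL for all GtoPdb substance records, and for the approved-drug subset.
all_sids_url = build_esearch_url("pcsubstance", SOURCE_QUERY)
approved_sids_url = build_esearch_url("pcsubstance", APPROVED_QUERY)

print(all_sids_url)
print(approved_sids_url)
```

Fetching either URL should return a record count that can be compared with the figures quoted on the poster, although counts will differ for later GtoPdb releases.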
Reference[1]: “The IUPHAR/BPS Guide to PHARMACOLOGY in 2016: towards curated quantitative interactions between 1300 protein targets and 6000
ligands”. Southan et al, Nucleic Acids Research, 2016 Jan 4;44(D1): Database Issue, D1054-68, PMID: 2646443
The International Union of Basic and Clinical Pharmacology and British
Pharmacological Society (IUPHAR/BPS) Guide to PHARMACOLOGY
database (GtoPdb) and its precursor IUPHAR-DB have been capturing
the structures of pharmacologically relevant ligands since 2005 [1].
The snapshot on the right shows our eight-category ligand
classification. As an active collaboration with the PubChem team, we
have submitted our ligand records for every GtoPdb release since
2012. For release 2016.4 (October) the query ("IUPHAR/BPS Guide
to PHARMACOLOGY"[SourceName]) retrieves 8674 Substance
Identifiers (SIDs) and 6565 Compound Identifiers (CIDs). The excess
of 2109 SIDs is accounted for by antibodies, small proteins and larger
peptides that cannot form CIDs. At just over 92 million CIDs covering
473 sources, a range of property filters and full Boolean operations for
combining query sets, PubChem provides an opportunity to “slice and
dice” our ligand set in comparative and informative ways. Just a small
set of example results is shown below.
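As an illustration of the arithmetic and Boolean "slice and dice" operations described above, the sketch below first checks the SID/CID gap using the counts quoted for release 2016.4, then shows intersection, union and difference on small toy identifier sets. The toy sets and their contents are invented purely for illustration and do not correspond to real PubChem CIDs.

```python
# Counts quoted in the text for GtoPdb release 2016.4.
sids_total = 8674  # Substance Identifiers
cids_total = 6565  # Compound Identifiers

# Substances that do not map to a CID (antibodies, small proteins, larger peptides).
sids_without_cid = sids_total - cids_total
print(sids_without_cid)

# Toy CID sets (hypothetical values) standing in for three PubChem sources.
gtopdb = {1, 2, 3, 4, 5}
drugbank = {4, 5, 6, 7}
chembl = {2, 3, 4, 8}

# AND: structures shared by GtoPdb and DrugBank.
shared_with_drugbank = gtopdb & drugbank

# OR: the combined, non-redundant set from both sources.
union_with_drugbank = gtopdb | drugbank

# NOT: structures found in GtoPdb but in neither of the other two sources.
unique_to_gtopdb = gtopdb - (drugbank | chembl)

print(shared_with_drugbank, len(union_with_drugbank), unique_to_gtopdb)
```

The same three operations correspond to the AND, OR and NOT combinations of query sets that the PubChem interface exposes.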
RESULTS