Neglected infectious diseases such as tuberculosis (TB) and malaria kill millions of people annually, and the oral drugs used to treat them are subject to resistance, requiring the urgent development of new therapeutics. Several groups, including pharmaceutical companies, have made large sets of antimalarial screening hit compounds and the associated bioassay data available for the community to learn from and potentially optimize. We have examined both intrinsic and predicted molecular properties across these datasets and compared them with large libraries of compounds screened against Mycobacterium tuberculosis in order to identify any obvious patterns, trends or relationships. One set of antimalarial hits provided by GlaxoSmithKline appears less suitable for lead optimization than two other sets of screening hits we examined. Active compounds against both diseases were found to have larger molecular weights (~350–400) and logP values of ~4.0, values that are, in general, distinct from those of the less active compounds. The antimalarial hits were also filtered with computational rules to identify potentially undesirable substructures. We were surprised that approximately 75–85% of these compounds failed one of the sets of filters that we applied during this work. The level of filter failure was much higher than for FDA-approved drugs or a subset of antimalarial drugs. Both antimalarial and antituberculosis drug discovery should likely use simple available approaches to ensure that the hits derived from large-scale screening are worth optimizing and do not clearly represent reactive compounds with a higher probability of toxicity in vivo.
Poster on the IUPHAR/MMV Guide to Malaria Pharmacology presented by Dr. Jane F. Armstrong at the EMBL BioMalPar XV: Pathology of the Malaria Parasite, EMBL Heidelberg, Germany May 2019
Exploiting Edinburgh's Guide to PHARMACOLOGY database as a source of protein ... (Chris Southan)
Presented by Jamie Davies at the SULSA Synthetic Biology Meeting, Edinburgh, 10 June 2014
http://www.eventbrite.co.uk/e/sulsa-synthetic-biology-meeting-registration-11251454403?aff=eorg
Abstract: Synthetic creation of new biological systems typically incorporates pathways and signaling modules from known protein building blocks. Testing the models underpinning the synthetic engineering thus needs the experimental manipulation of individual proteins, for example, ablating a specific enzyme activity via RNAi, SNP mutation, or knockout. However, the option of small-molecule inhibition as the system perturbation has the advantages of 1) rapid onset, 2) dose-response, 3) analog testing for structure-activity relationships, 4) exploring mixtures for combinatorial effects, 5) pulsing and reversal by wash-out, 6) accurate measurements of added substances and 7) a vast precedent of published results in natural systems from medicinal chemistry, pharmacology, and chemical biology. For synthetic biologists the GToPdb [1] can thus be considered a compendium of the latter. It encompasses an interaction matrix between ~4000 small molecules and ~1000 human proteins with a focus on drugs, clinical candidates, research compounds and peptide ligands. These not only have ~10,000 mapped binding constants, but the spectrum of documented modulation also extends across enzymes, receptors, channels and transporters. It thus becomes an increasingly plausible option to choose a “Lego protein” from GToPdb as a synthetic system component that can have experimentally useable activity probes available from chemical vendors. Even if it does not currently have a suitable target-probe pair, as a knowledge base (and expertise resource via the curation team who populate it) GToPdb is an ideal starting point from which to walk out to wider chemogenomic spaces. For example, while an approved drug and its target might seem a logical choice, analogs from the lead series, different chemotypes from which the drug was optimized, or even compounds that failed in development can have superior probe-like properties for in vitro experiments (e.g. be more potent, specific and soluble).
The GToPdb facilitates access to such compound data via curated papers and patents.
References
1. Pawson AJ, Sharman JL, Benson HE, Faccenda E, Alexander SP, Buneman OP, Davenport AP, McGrath JC, Peters JA, Southan C, Spedding M, Yu W, Harmar AJ; NC-IUPHAR. The IUPHAR/BPS Guide to PHARMACOLOGY: an expert-driven knowledgebase of drug targets and their ligands. Nucleic Acids Res. 2014 Jan 1;42(1)
Sorting bioactive wheat from database chaff (Chris Southan)
Abstract
Databases of bioactive compounds are crucial for pharmacology, drug discovery and chemical genomics as public sources approach ~100 million records. However, in recent years this famine-to-feast presents difficulties for searching chemical structures and linked activity data, particularly for those unfamiliar with the constitutive challenges of molecular representation in silico (PMID 25415348). A key problem is entries of structural variants of “the same thing” as pharmacological entities (i.e. representational multiplexing). For example, a 2009 comparison of three database subsets of ~1200 approved drugs recorded only 807 structures in common (PMID 20298516). In addition, published counts of approved drugs vary widely. These issues have been continually encountered by the Guide to PHARMACOLOGY database (GtoPdb) team that, since 2009, has achieved the curation of ~5500 small molecules (including approved drugs) from papers. Concomitantly, we have noticed an increase in multiplexing as PubChem pushes towards 65 million compound identifiers (CIDs). Since one of our key objectives is to affinity-map ligands to their targets, we decided to assess this multiplexing problem in order to optimise our curation rules. The results have implications for the entire bioactivity information space. We began by compiling CID sets for seven different sources within PubChem encompassing approved drugs. Initially a 7x7 pairwise comparison matrix indicated low overlap between these sources. A Venn diagram was then made from the approved drug CIDs mapped by DrugBank, Therapeutic Target Database and ChEMBL. At 749, the three-way intersect was less than 35% of the union of all CIDs covered by the sets. Strikingly, this looks worse than the 2009 study (although the sources and comparison methods were different). We will present further analyses that go some way towards explaining these results.
One of these is determining “same connectivity” statistics inside PubChem as a measure of multiplexing. For DrugBank, each approved drug was related to, on average, 19 different CIDs as structural variants. Analysis of multiplexing confirmed trends we had observed during individual drug curation. This included ~30% stereoisomer enumerations but, surprisingly, ~70% isotopic derivatives, dominated by patent-derived virtual deuteration. We also established that the ratio of submissions (SIDs) to CIDs was 78. The paradox was that, despite this high “majority vote” support for approved drug CIDs curated by DrugBank, only 55% were in the 3-way consensus (figures for the other two curated sources were similar). Analysing by year in PubChem indicated how the recent expansion of vendor and patent-extraction structures contributes to both multiplexing and the SID:CID ratio. While approved drugs are strongly impacted, associated problems, such as split activity data and deciding the “correct” structures, affect essentially all public drug discovery chem
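The overlap figures quoted in this abstract (a three-way intersect expressed as a share of the union, plus a pairwise comparison matrix) come down to set operations on CID lists. The following is a minimal sketch of that arithmetic, not the authors' actual workflow; the function names and the toy CID values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def overlap_stats(sources):
    """Return (intersect size, union size, intersect/union) across all sources."""
    sets = list(sources.values())
    intersect = set.intersection(*sets)
    union = set.union(*sets)
    return len(intersect), len(union), len(intersect) / len(union)


def pairwise_matrix(sources):
    """Return the number of shared CIDs for every pair of sources."""
    return {
        (a, b): len(sources[a] & sources[b])
        for a, b in combinations(sorted(sources), 2)
    }


# Hypothetical toy CID sets; real sets would be downloaded from PubChem.
toy = {
    "DrugBank": {1, 2, 3, 4, 5},
    "TTD": {2, 3, 4, 6},
    "ChEMBL": {3, 4, 5, 7, 8},
}

n_common, n_union, fraction = overlap_stats(toy)
pairs = pairwise_matrix(toy)
print(f"{n_common} of {n_union} CIDs are shared by all sources ({fraction:.0%})")
```

With real data, a low fraction would not by itself show that the sources disagree on which drugs are approved; as the abstract notes, it can also reflect different CIDs being chosen for the same pharmacological entity.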
Sorting bioactive wheat from database chaff: Challenges of discerning correct... (Guide to PHARMACOLOGY)
Since 2009 the Guide to PHARMACOLOGY database (GtoPdb) team have curated 7586 ligands from papers, including approved drugs, clinical candidates, research compounds, peptides and clinical antibodies (PMID 24234439). As PubChem pushes towards 70 million compound identifiers (CIDs), we have noticed the problem of “multiplexing” during the curation of 5713 small molecules as CIDs. We encountered many representations (i.e. different CIDs) of the same pharmacological entities. Three types of variation dominate: stereochemistry, mixtures and isotopic analogues. These are known constitutive issues for chemical databases, but in recent years we observed that this multiplexing was reaching problematic proportions (i.e. more chaff), especially for clinically used drugs (i.e. proportionally less wheat).
To accomplish a desired systemic effect, drug molecules must reach the systemic circulation after extravascular administration. The percentage of the administered dose that reaches the systemic circulation intact is called “bioavailability” (BA). Absolute bioavailability compares the BA of the active drug in systemic circulation following non-intravenous administration
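As a worked illustration, absolute bioavailability is conventionally estimated as the dose-normalised ratio of the area under the plasma concentration-time curve (AUC) after extravascular dosing to the AUC after intravenous dosing. The sketch below assumes that standard definition, which the description above does not spell out; the function name and the numbers are hypothetical.

```python
def absolute_bioavailability(auc_ev, dose_ev, auc_iv, dose_iv):
    """Return F = (AUC_ev / dose_ev) / (AUC_iv / dose_iv) as a fraction.

    auc_ev, dose_ev: AUC and dose for the extravascular (e.g. oral) route.
    auc_iv, dose_iv: AUC and dose for the intravenous reference.
    AUCs must share one unit, and doses must share one unit.
    """
    if dose_ev <= 0 or dose_iv <= 0 or auc_iv <= 0 or auc_ev < 0:
        raise ValueError("doses and the IV AUC must be positive; AUC_ev must be >= 0")
    return (auc_ev / dose_ev) / (auc_iv / dose_iv)


# Hypothetical values for illustration only.
f = absolute_bioavailability(auc_ev=40.0, dose_ev=100.0, auc_iv=50.0, dose_iv=50.0)
print(f"F = {f:.0%}")
```

Normalising by dose matters because the intravenous reference is often given at a lower dose than the extravascular formulation.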
Poster on GtoImmuPdb presented at European Congress of Immunology (Amsterdam, Sep 2018). Overview of the main data types and features included in this extension to the IUPHAR/BPS Guide to PHARMACOLOGY
Searching for chemical information using PubChem (Sunghwan Kim)
Presented at the 257th American Chemical Society (ACS) National Meeting in Orlando, FL (April 1, 2019). [CHED 303]
==== Abstract ====
PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public chemical database, which provides information on a broad range of chemical entities, including small molecules, lipids, carbohydrates, and (chemically-modified) amino acid and nucleic acid sequences (including siRNA and miRNA). With three million unique users per month at peak, PubChem is ranked as one of the most visited chemistry websites in the world. A substantial number of PubChem users are between ages 18 and 24 and are likely to be undergraduate or graduate students at academic institutions. Therefore, PubChem has great potential as an online resource for chemical education. In this talk, we will present “PubChem Search”, a new web interface that allows users to quickly find desired chemical information. This interface supports chemical name search as well as various types of chemical structure search, including identity/similarity search, superstructure/substructure search, and molecular search. Using PubChem Search, it is also possible to search for journal articles or patent documents that mention a given chemical. The hits returned from a search can be downloaded to local machines or further refined or analyzed in conjunction with other PubChem tools and services. In this presentation, we will demonstrate how the PubChem Search interface can be used to search beyond Google for chemical information of interest.
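The talk covers the PubChem Search web interface. For readers who want to script a comparable name lookup, the sketch below instead uses PUG REST, PubChem's programmatic interface, which is a swapped-in route and not something the abstract describes. It only builds request URLs and makes no network call; the helper names are hypothetical.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def name_to_cids_url(name):
    """Build a PUG REST URL that resolves a chemical name to CIDs (JSON)."""
    return f"{PUG_REST}/compound/name/{quote(name, safe='')}/cids/JSON"


def cid_properties_url(cid, properties=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL that returns selected computed properties for a CID."""
    return f"{PUG_REST}/compound/cid/{int(cid)}/property/{','.join(properties)}/JSON"


url_a = name_to_cids_url("acetylsalicylic acid")
url_b = cid_properties_url(2244)  # CID 2244 is aspirin
print(url_a)
print(url_b)
```

The resulting URLs can be fetched with any HTTP client, for example `urllib.request.urlopen`, and scripted use should stay within PubChem's published usage limits.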
QSAR Modeling of Bisbenzofuran Compounds using 2D-Descriptors as Antimalarial... (ijtsrd)
In the present study we have performed quantitative structure-activity relationship (QSAR) analysis for 43 bisbenzofuran derivatives to estimate antimalarial activity using 2D descriptors. Several significant QSAR models have been calculated for predicting the antimalarial activity (−logIC50) of these molecules using the multiple linear regression (MLR) technique. Among the QSAR models obtained, a four-parameter model was the most significant, with R² = 0.9502. An external set was used to confirm the predictive power of the models. A high correlation between experimental and predicted antimalarial activity values was obtained in the validation, which demonstrated the quality of the derived QSAR models. Tripti Kaushal | Anita K | Bashirulla Shaik | Vijay K. Agrawal "QSAR Modeling of Bisbenzofuran Compounds using 2D-Descriptors as Antimalarial Agents" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-2, February 2018, URL: http://www.ijtsrd.com/papers/ijtsrd9497.pdf http://www.ijtsrd.com/chemistry/other/9497/qsar-modeling-of-bisbenzofuran-compounds-using-2d-descriptors-as-antimalarial-agents/tripti-kaushal
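The R² statistic quoted for the regression model is the coefficient of determination: one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that calculation on hypothetical activity values; it is not the study's data or its fitting code, and it assumes the predictions have already been produced by some fitted model.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical activities and model predictions, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(observed, predicted)
print(f"R^2 = {r2:.2f}")
```

A high R² on the training compounds shows goodness of fit only, which is why the study also reports validation on an external set.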
GlaxoSmithKline's recent release of a huge tranche of whole-cell malaria screening data into the public domain, accompanied by a corresponding publication, raises some issues for consideration before this exemplar becomes a trend. We have examined the data at a high level by studying the molecular properties and by considering the various alerts presently in use by major pharma companies. We acknowledge the potential value of such data but also question the actual value of such datasets once released into the public domain. We also suggest approaches that could enhance the value of such datasets to the community and theoretically offer more immediate benefit to the search for leads for other neglected diseases.
Poster presented at the Elixir All-Hands Meeting in Lisbon, June 2019. Gives a broad summary of Guide to Pharmacology activities in the last year. Emphasising new tools and our extension into malaria pharmacology.
The Role of Bioinformatics in The Drug Discovery Process (Adebowale Qazeem)
The Role of Bioinformatics in The Drug Discovery Process is an undergraduate seminar presentation in the Department of Biochemistry, Faculty of Life Sciences, University of Ilorin, Ilorin.
Non-targeted screening, targeted screening and suspect screening, as well as “Known Unknowns” and “Unknown Unknowns” are now common terms in the field of water analysis. While data processing can be highly automated, the identification of chemicals from extracted masses, formulae or fragmentation utilizes reference spectral libraries or identification and ranking of tentative candidate lists from large structure libraries. The US-EPA CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard) provides access to data for ~875,000 substances, searchable by mass and formula and then ranked using associated meta-data. Cheminformatics approaches are also utilized to provide mapped relationships between individual substances and their “MS-Ready” (desalted, non-stereospecific) forms. This presentation will review how the freely available Dashboard application can be utilized to support structure identification using mass spectrometry and our efforts to enhance the application using computational spectral fragmentation and access to predicted data such as relative retention time index values and computed collision cross section values to support ion mobility spectrometry. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
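The mass-based candidate search described here can be pictured as two steps: keep the structures whose monoisotopic mass falls within a parts-per-million window of the measured mass, then order the survivors by a metadata score. The sketch below is a generic illustration and not the Dashboard's own implementation; the field names, the `source_count` ranking key, the 5 ppm tolerance and the candidate records are all hypothetical.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def rank_candidates(measured_mass, candidates, tol_ppm=5.0):
    """Keep candidates within tol_ppm of the measured mass, best-supported first."""
    hits = [
        c for c in candidates
        if abs(ppm_error(measured_mass, c["mono_mass"])) <= tol_ppm
    ]
    # 'source_count' stands in for whatever metadata is used for ranking.
    return sorted(hits, key=lambda c: c["source_count"], reverse=True)


# Hypothetical candidate records, for illustration only.
candidates = [
    {"name": "candidate A", "mono_mass": 200.0000, "source_count": 3},
    {"name": "candidate B", "mono_mass": 200.0005, "source_count": 40},
    {"name": "candidate C", "mono_mass": 200.0100, "source_count": 90},
]

ranked = rank_candidates(200.0002, candidates)
print([c["name"] for c in ranked])
```

Candidate C is excluded despite its high metadata score because it lies about 49 ppm from the measured mass, so the mass window acts as a hard filter and the metadata only orders what remains.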
Neglected and rare diseases traditionally have not been the focus of large pharmaceutical company research as biotech and academia have primarily been involved in drug discovery efforts for such diseases. This area certainly represents a new opportunity as the pharmaceutical industry investigates new markets. One approach to speed up drug discovery is to examine new uses for existing approved drugs; this is termed drug repositioning or drug repurposing and has become increasingly popular in recent years. Analysis of the literature reveals that using high-throughput screening there have been many examples of FDA approved drugs found to be active against additional targets that can be used to therapeutic advantage for repositioning for other diseases. To date there are far fewer such examples where in silico approaches have allowed for the derivation of new uses. It is suggested that with current technologies and databases of chemical compounds (drugs) and related data, as well as close integration with in vitro screening data, improved opportunities for drug repurposing will emerge. In this publication a review of the literature will highlight several proof of principle examples from areas such as finding new inhibitors for drug transporters with 3D pharmacophores and uncovering molecules active against Mycobacterium tuberculosis (Mtb) using Bayesian models of compound libraries. Research into neglected or rare/orphan diseases can likely benefit from in silico drug repositioning approaches and accelerate drug discovery for these diseases.
In silico Drug Design: Prospective for Drug Lead Discovery (inventionjournals)
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Identifying Antibiotics posing potential Health Risk: Microbial Resistance Sc... (Atai Rabby)
The present study was undertaken to investigate trends in antimicrobial resistance and to identify antibiotics that pose a public health risk due to resistant microbes in Bangladesh. Antimicrobial resistance data for Bangladesh over the last 10 years were collected and compared with the corresponding antibiotic consumption rates. In this study, a factor is introduced to identify the therapeutic sub-classes of antibiotics that are most threatened by growing antimicrobial resistance. High resistance trends against several antibiotics, such as cloxacillin, ampicillin, metronidazole, oxacillin, amoxicillin, tetracycline, cotrimoxazole and penicillin, were also identified. Heat map analysis revealed that nine antimicrobial agents (metronidazole, amoxicillin, tetracycline, cotrimoxazole, cephradine, penicillin, ciprofloxacin, doxycycline and nalidixic acid) are associated with public health risk due to growing bacterial resistance. This study should contribute to minimizing the development and spread of antibiotic resistance by revealing the microbial resistance scenario, and should aid the choice of effective antibiotic treatment options in Bangladesh.
Drug discovery and development is and always has been the most exciting part of clinical pharmacology. It is my attempt to compile the basic concepts from various books, articles and online journals. Feel free to comment.
Research trends in different pharmaceutical areas: Natural product chemistry
Imtiaj Hossain Chowdhury
B’Pharm (Jahangirnagar University), M’Pharm (Jahangirnagar University)
Master’s in Public Health (American International University Bangladesh)
Linde Gas whitepaper 'Drug discovery advances inextricably linked to specialt...Linde Gas Benelux
The development and commercialisation of new medical drugs is a complex and costly process, but increasing pressure is being placed on drug companies to accelerate the timeline from discovery of new drugs, through to clinical trials and then to their release onto the market. The pharmaceutical industry continues to demand ever more advanced products aimed at improving health and the quality of modern-day life.
Research and development take place in pharmaceutical laboratories using analytical instruments such as gas chromatographs with multiple detectors, liquid chromatographs coupled with mass spectrometers (LC-MS), ultraviolet/visible (UV/VIS) spectrometers and nuclear magnetic resonance (NMR) spectrometers. These significantly accelerate research and development cycles, ultimately bringing beneficial drugs onto the market faster than ever before. The effective operation of these instruments depends on the use of the appropriate gases or gas mixtures.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Enhancing Performance with Globus and the Science DMZGlobus
ESnet has led the way in helping national facilities—and many other institutions in the research community—configure Science DMZs and troubleshoot network issues to maximize data transfer performance. In this talk we will present a summary of approaches and tips for getting the most out of your network infrastructure using Globus Connect Server.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 To discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
Meta-analysis of molecular property patterns and filtering of public datasets of antimalarial “hits” and drugs
Sean Ekins a,b,c,d,* and Antony J. Williams e
a Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010. Tel: 215-687-1320; E-mail: sekins@collaborativedrug.com, ekinssean@yahoo.com
b Collaborations in Chemistry, 601 Runnymede Avenue, Jenkintown, PA 19046, USA;
c Department of Pharmaceutical Sciences, University of Maryland, Baltimore, MD, USA.
d Department of Pharmacology, Robert Wood Johnson Medical School, University of Medicine & Dentistry of New Jersey, Piscataway, New Jersey 08854, USA;
e Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, NC-27587.
Summary
Neglected infectious diseases such as tuberculosis (TB) and malaria kill millions of
people annually and the oral drugs used are subject to resistance requiring the urgent
development of new therapeutics. Several groups, including pharmaceutical companies,
have made large sets of antimalarial screening hit compounds and the associated bioassay
data available for the community to learn from and potentially optimize. We have
examined both intrinsic and predicted molecular properties across these datasets and
compared them with large libraries of compounds screened against Mycobacterium
tuberculosis in order to identify any obvious patterns, trends or relationships. One set of
antimalarial hits provided by GlaxoSmithKline appears less optimal for lead optimization
compared with two other sets of screening hits we examined. Active compounds against
both diseases were identified to have larger molecular weight (~350-400) and logP values
of ~4.0, values that are, in general, distinct from the less active compounds. The
antimalarial hits were also filtered with computational rules to identify potentially
undesirable substructures. We were surprised that approximately 75-85% of these
compounds failed one of the sets of filters that we applied during this work. The level of
filter failure was much higher than for FDA approved drugs or a subset of antimalarial
drugs. Both antimalarial and antituberculosis drug discovery should likely use simple
available approaches to ensure that the hits derived from large scale screening are worth
optimizing and do not clearly represent reactive compounds with a higher probability of
toxicity in vivo.
Introduction
Neglected infectious diseases such as tuberculosis (TB) and malaria kill over two million
people annually 1 while estimates suggest that over 2 billion individuals are infected with
Mycobacterium tuberculosis (Mtb) alone 2. These statistics represent both enormous
economic and healthcare challenges for the countries and governments affected while
these diseases are generally not the focus for large pharmaceutical companies.
Consequently, research on these neglected diseases in general, and malaria in particular,
is largely comprised of the disjointed efforts of many academic and other non-profit
laboratories distributed across the globe. These many independent efforts, while
providing significant contributions, often lack the project management, data handling,
and pipeline integration functions that are critical to efficiently discovering, developing
and bringing new drugs to market. These are generally integrated functions found in the
pharmaceutical industry, alongside many researchers experienced in drug development.
In recent years non-profit organizations have stepped into the void to manage, coordinate
and fund such efforts. Such organizations include the Medicines for Malaria Venture
(http://www.mmv.org/), the TB Alliance (http://www.tballiance.org/home/home.php) and
the drugs for neglected diseases initiative (http://www.dndi.org/). Pharmaceutical
company contributions to these efforts, while not necessarily negligible, are rarely shared
publicly until development issues halt project development. We are however seeing more
partnering with non-profits to take clinical candidates into large clinical trials and share
the associated burden of costs. There have been recent developments in providing the
neglected disease community with both collaborative tools and databases to integrate
drug discovery efforts together into effective virtual pharmaceutical organizations that
can efficiently deliver drug candidates for further development 3-5. The urgency to
develop new drugs is obvious as antimalarial resistance has led to a re-emergence of the
disease in areas once controlled. Of particular concern are the chloroquine resistant
(CQR) Plasmodium strains, which have resulted in an increase in malaria mortality 1. Even
the artemisinins are subject to resistance, as noted on the Thai-Cambodia border, which has
led to new World Health Organization guidelines 6.
The efforts around screening for neglected diseases like malaria and TB have, in
recent years, significantly increased to the point that very large datasets from hundreds of
thousands to over a million compounds in some cases are now routinely tested 7-10. These
datasets have led to the assessment of what molecular properties may be used to
parameterize hits or lead compounds in the case of TB 5, 11. For example, in a previous
study we have compared actives and inactives against Mtb in a dataset containing over
200,000 compounds 5. The mean molecular weight (357 ± 85), logP (3.6 ±1.4) and rule of
5 alerts (0.2 ± 0.5) were statistically significantly (based on t-test) higher in the most
active compounds, while the mean PSA (83.5 ± 34.3) was slightly lower compared to the
inactive compounds for the single point screening data 5. To date we have assessed 15
different datasets for TB extracted from publications, obtained from screening groups or
generated through our own manual annotation of the scientific literature and patents 11.
These compounds include known drugs against Mtb as well as screening hits and leads.
Our most recent analysis for TB used a dataset consisting of 102,633 molecules screened
by the same laboratory against Mtb 11. We were able to analyze the molecular properties,
differentiate the actives from the inactives and show that the actives had statistically
significantly (based on t-test) higher values for the mean logP (4.0 ± 1.0) and rule of 5
alerts (0.2 ± 0.4), while also having lower HBD count (1.0 ± 0.8), atom count (41.9 ± 9.4)
and lower PSA (70.3 ± 29.5) than the inactives 11. Two recent landmark studies 9, 10
have provided large datasets of antimalarial compounds that were broadly described as
drug-like. This term, however, has a broad definition 12-18, and in one case the drug-like
compounds were suggested to be larger and more hydrophobic than the starting screening
collection (an average molecular weight of 446 and logP of about 5.0 9). As fundamentally
obvious as such an analysis would appear to anyone from the pharmaceutical industry, we are
not aware of any similar comprehensive analyses of physicochemical properties across multiple
datasets performed on compounds screened for activity against Plasmodium falciparum or other
Plasmodium species. This type of meta-analysis is likely to be more revealing than
analysis of a single dataset. Knowing the optimum physical properties would at least
allow academic researchers to focus their efforts on screening compounds as close as
possible to the desired values using calculations that can be readily performed. However
it is important to note that as with any rules, guidelines or filters there may be compounds
that break them that are still of interest, e.g. large antibacterials, prodrugs, active
metabolites etc 19, 20.
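The comparisons of actives and inactives described above rest on a t-test of mean property values. The following is a minimal sketch of such a comparison, assuming Welch's unequal-variance form of the statistic (the text says only "t-test", so the exact variant used is an assumption). The function name `welch_t` and the logP values are hypothetical and purely illustrative; they are not data from any of the cited screens.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for the difference in means of two samples.

    Uses the sample variance (n - 1 denominator) of each group.
    """
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each group needs at least two values")
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical logP values for a handful of actives and inactives
# (illustrative only, not taken from the datasets discussed here).
actives_logp = [4.1, 3.8, 4.6, 3.9, 4.4]
inactives_logp = [3.0, 3.4, 2.7, 3.6, 3.1]

t = welch_t(actives_logp, inactives_logp)
print(f"t = {t:.2f}")  # a positive t means the actives have the higher mean
```

A p-value would additionally require the Welch-Satterthwaite degrees of freedom and a t distribution, which a statistics package can supply.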
We have also applied, to the hit molecules against Mtb, chemical rules that are
widely used by pharmaceutical companies as filters to enable removal or flagging of
undesirable molecules, false positives and frequent hitters from HTS screening libraries,
as well as to select compounds from commercial vendors 21. Examples of such widely used
substructure filters include REOS from Vertex 15, filters from GSK 22, BMS 23 and Abbott
24-26. These filters in particular pick up a range of undesirable chemical substructures such
as thiol traps and redox-active compounds, epoxides, anhydrides, and Michael acceptors.
Reactivity can be defined as the ability to covalently modify a cysteine moiety in a
surrogate protein 24-26. One group has recently developed a series of over 400
substructural features for removal of Pan Assay INterference compoundS (PAINS) from
screening libraries 27. While such filters are widely available to pharmaceutical
industry researchers, who can readily screen hundreds of thousands of compounds, academics
have no comparable capability to access all these rule sets and screen large libraries.
Even the recently available Smartsfilter website resource
(http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter) used in this study allows a
maximum of 5000 compounds. With the recent publication and open availability of
several sets of malaria hits 9, 10 in ChEMBL 28, PubChem 29 and CDD 4, it was decided to
analyze them based on available filters and molecular descriptors to evaluate whether
there were any common features. In addition we compared the malaria hits and datasets
screened against Mtb 11, to potentially develop a further understanding of the influence of
physicochemical properties on compounds with activity against these neglected diseases.
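Because the Smartsfilter site accepts at most 5000 compounds per submission, a larger hit set has to be divided before filtering. A minimal sketch of that batching step follows. It only splits a list of structures; it does not contact the website. The function name `batches` and the placeholder library are hypothetical, and the library size simply mirrors the approximate size of the GSK hit set.

```python
from itertools import islice
from typing import Iterable, Iterator, List

MAX_BATCH = 5000  # per-submission limit stated in the text for Smartsfilter


def batches(structures: Iterable[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive lists holding at most `size` structures each."""
    if size < 1:
        raise ValueError("size must be positive")
    iterator = iter(structures)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Placeholder library: 13,500 dummy SMILES strings (methane repeated).
library = ["C"] * 13500
sizes = [len(batch) for batch in batches(library)]
print(sizes)  # [5000, 5000, 3500]
```

Each batch could then be submitted separately and the pass or fail results concatenated in the original order.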
Experimental Methods
CDD Database
The development of the CDD database (Collaborative Drug Discovery Inc. Burlingame,
CA) has been described previously in detail with applications for collaborative malaria
research 4.
Datasets
Screening datasets were collected and uploaded in CDD TB from sdf files and mapped to
custom protocols (Table 1) (see: http://www.collaborativedrug.com/register) 11. The
malaria data were obtained as previously described 9, 30. We have also used the
Microsource US Drugs database (http://www.msdiscovery.com/).
Descriptors
The various datasets were compared using simple calculated molecular properties
including logP, hydrogen bond donor, hydrogen bond acceptor, Lipinski rule of 5 alerts,
polar surface area, molecular weight, rotatable bonds, and atom counts, calculated using
the Marvin plugin (ChemAxon, Budapest, Hungary) within the CDD database. Datasets
with molecular properties were readily exported from the CDD database to sdf files and
Excel files for use with other statistical or modeling software (see below).
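As an illustration of one of the descriptors listed above, the sketch below counts Lipinski rule of 5 alerts from already-calculated properties, using the standard thresholds (molecular weight above 500, logP above 5, more than 5 hydrogen bond donors, more than 10 hydrogen bond acceptors). It assumes the descriptor values have been exported beforehand; the `Descriptors` class, the function name and the example compound are hypothetical, and the exact counting used by the Marvin plugin within CDD may differ.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Pre-calculated properties for one compound (hypothetical container)."""
    mol_weight: float
    logp: float
    hbd: int  # hydrogen bond donors
    hba: int  # hydrogen bond acceptors


def rule_of_5_alerts(d: Descriptors) -> int:
    """Count how many of the four Lipinski thresholds are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.hbd > 5,
        d.hba > 10,
    ])


# Hypothetical large, hydrophobic compound: exceeds the weight and logP limits.
example = Descriptors(mol_weight=520.3, logp=5.6, hbd=2, hba=6)
alerts = rule_of_5_alerts(example)
print(alerts)  # 2
```

Averaging this count over a dataset gives a mean alert value of the kind reported alongside the other mean properties.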
SMARTS Filters
The Abbott ALARM 24, Glaxo 22 and Pfizer LINT SMARTS (also called the Blake
filters, http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter?help=TRUE 31) filter
calculations were performed through the Smartsfilter web application, kindly provided by
the Division of Biocomputing, Dept. of Biochemistry & Molecular Biology, University
of New Mexico, Albuquerque, NM,
(http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter). This software identifies the
number of compounds that pass or fail any of the filters implemented. Each filter was
evaluated individually with each set of compounds.
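The per-filter evaluation described above amounts to tallying, for each filter set, the percentage of compounds in a dataset that fail. A minimal sketch of that tally is shown below, assuming the pass or fail flags have already been obtained from a filtering tool. The function name, the filter labels and the flags are hypothetical toy values, not results from this study.

```python
from typing import Dict, List


def failure_rates(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Map each filter name to the percentage of compounds that failed it.

    `results[name][i]` is True when compound i passed filter `name`.
    """
    rates: Dict[str, float] = {}
    for name, passed in results.items():
        if not passed:
            raise ValueError(f"no compounds recorded for filter {name!r}")
        failed = sum(1 for ok in passed if not ok)
        rates[name] = 100.0 * failed / len(passed)
    return rates


# Toy pass/fail flags for four compounds under two hypothetical filter sets.
toy_results = {
    "filter_A": [True, False, False, False],
    "filter_B": [True, True, True, False],
}
rates = failure_rates(toy_results)
print(rates)  # {'filter_A': 75.0, 'filter_B': 25.0}
```

Keeping each filter's tally separate, rather than merging the flags, matches the way each filter was evaluated individually with each set of compounds.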
Results
Three datasets of antimalarial screening hits were evaluated with both simple molecular
properties calculated in CDD using ChemAxon 4 (Table 1) and also multiple filters for
undesirable features using the Smartsfilter website (Table 2) incorporating rules widely
used by at least 3 pharmaceutical companies. Additional datasets of drugs and
antimalarial compounds were used as comparators.
Molecular property analysis of antimalarial datasets
The GSK antimalarial hits dataset 9 stands out from the other datasets in terms of
physicochemical properties (Table 1). The mean molecular weight, logP and number of
rotatable bonds are much higher than in the St. Jude 10 and Novartis datasets of
antimalarial compounds 28. The GSK dataset is much closer to the mean property values
for the subset of 165 FDA drugs from the Johns Hopkins University set of compounds
screened against several drug targets 32-35 that were more active against malaria (although
the standard deviations around these properties are very large compared to the other
datasets). The St Jude and Novartis antimalarial compound datasets have almost identical
mean molecular properties which are much closer to the widely accepted values for
“lead-like” compounds (MW < 350, logP< 3) 36, 37 compared with the GSK data.
Filtering the antimalarial datasets for undesirable compounds.
The GSK, St Jude and Novartis datasets have very high failure rates with the Abbott
Alerts 24, 26 (75-85%) and Pfizer LINT filters (40-57%) (Table 2). The failures with the
GSK filters 22 are generally lower as seen previously (< 7.5%) 11, 30. The subset of 165
active antimalarial compounds in the Johns Hopkins dataset has an enrichment of filter
failures compared to the total Johns Hopkins dataset of drugs. What stands out for the
Johns Hopkins set is the much lower percentage of failures with the Abbott filters (63%)
which is close to the total drug dataset or the Microsource drugs dataset (Table 2). It
would appear that a general trend for those compounds active against malaria across all
datasets is the high level of failures relative to the various pharmaceutical company filters
and, in particular, the Abbott filters. This may be antimalarial mechanism related or a
limitation in the starting libraries used. The latter is more likely as there are 3
independent datasets as well as the set of compounds that includes many FDA drugs from
Johns Hopkins.
Surprisingly, a set of 14 FDA approved widely used antimalarial drugs
(amodiaquine, amopyroquine, artesunate, atovaquone, proguanil, chloroquine,
halofantrine, hydroxychloroquine, mefloquine, pentaquine, primaquine, pyrimethamine,
quinacrine and quinine) has properties much closer to the St Jude and Novartis hits
(Table 1). These compounds had fewer failures with the Abbott filters when compared to
the GSK, Novartis and St. Jude datasets. This suggests that the mean molecular descriptor
values and filter failure profiles for at least 2 out of the 3 large malaria active compound
datasets are close to known drugs, and these may be focused on as more desirable in
future screening campaigns and for lead optimization.
Discussion
We have previously analyzed the GSK dataset of antimalarial compounds alone (Table 1)
11, 30 and highlighted the high percentage that fail the Pfizer and Abbott filters and
compared it with a set of US FDA drugs from the Microsource database (Table 1), the
Mtb active compounds and other literature examples 38. Many companies avoid
compounds that have reactive groups prior to screening and the availability and use of
such filters is common. This is not however the case in academia (where the research in
neglected diseases is predominantly performed) unless you have access to core
cheminformatics resources. Similarly, academic groups rarely analyze the calculated
physicochemical properties of the libraries of compounds tested which would allow them
to focus on a narrower range and improve their chances of finding active compounds that
are better optimization starting points (with a lower probability of failure). The GSK
screening hits are described as large and very hydrophobic 9 which others would suggest
as presenting a significant solubility and absorption challenge 17. These mean molecular
properties were not “lead-like” but were closer to “natural product lead-like” rules 39
which is in marked contrast to the GSK paper 9 which describes the compounds as “drug-
like”. We suggested that these GSK antimalarial hits are also vastly different to the mean
molecular properties of compounds that have shown activity against Mtb 11, which are
generally of lower molecular weight, less hydrophobic and have fewer rotatable bonds 5.
Our further analysis using two additional large datasets of antimalarial
compounds and FDA approved drugs tested for antimalarial activity, as well as known
FDA approved drugs, suggests that the GSK data may represent more difficult starting
points for lead optimization. For example, the GSK dataset 9 has mean molecular weight,
logP and number of rotatable bond values that are far higher than those in the St. Jude 10 and
Novartis datasets of antimalarial compounds 28 evaluated in this study. Interestingly the
St Jude and Novartis datasets have almost identical mean molecular properties that are
closer to desirable “lead-like” characteristics 36, 37. While all the antimalarial datasets
(GSK, St Jude and Novartis) have very high failures with the Abbott Alerts (Table 2),
this is perhaps a point of concern when compared to the FDA approved drugs or FDA
approved antimalarials, as it indicates that all of these datasets of recently screened
compounds have a high percentage of potentially thiol reactive compounds. A recent
analysis by us suggests that compounds known to cause drug induced liver injury also
have a relationship with these types of filters such that they can be used as a partial
predictor for this toxicity (data not shown). Compounds failing the Abbott alerts may
have a high probability of failure and toxicity. As stated earlier, the antimalarial
mechanism of action may require such reactive compounds; historically, however, the
14 FDA approved widely used antimalarial drugs showed much lower numbers of filter
failures. This suggests that it is possible to develop antimalarials that pass the filters.
Out of the 3 openly available datasets, the St Jude and Novartis hits are closer to the ideal
starting points for lead optimization as defined by others. One suggestion from this
combined work is that such reactivity filters or rules should be more widely instituted for
groups working in neglected diseases before they embark on large library screening so
that they may be alerted to potential false positives beforehand. The data we have
provided on pharmaceutical rule failures are currently not available at any of the website
repositories which host these 3 antimalarial datasets, however in one case we have
suggested how they might be added into the CDD database 11, but an alternative may be
via linkage to the Smartsfilter website. One deficit we have noticed is the Smartsfilter
website does not identify which substructures failed, instead just a pass or fail score is
associated with a molecule. Undoubtedly knowing why a compound failed would be
instructive. As the neglected disease screening datasets are further evaluated, it is likely
that such filtering results will be useful for others and should ideally be stored alongside
the screening data.
Conclusion
Within a short space of time three large screening datasets of antimalarial hits
have become openly available and hosted in three well known databases and we are also
seeing deposition in other databases like ChemSpider. This offers the availability of
further calculated properties and links to other information that are unavailable at any of
the other databases. Two of these antimalarial datasets have been provided by
pharmaceutical companies (GSK and Novartis) and this represents something of a
breakthrough in releasing data to the neglected disease research community. To our
knowledge there has been no collective analysis of these data from either a molecular
properties or undesirable features perspective. This is important before further resources
are put into optimization of any of the resulting hits. We, and others, have already
described how important it is to ensure not only the quality of any data made available to
the research community, including chemical structure verification 21, 30, but also the
chemical properties that can identify potentially undesirable problems with molecules,
whether this be poor solubility or toxicity etc. While others have identified problems in
other sets of compounds caused by aggregation 40, false positives 41-47 or artifacts 48 in
screening libraries, these can be pre-filtered and it is not appropriate that the screeners
should remain ignorant of such liabilities any longer. The weight of evidence from the
datasets we have evaluated suggests that although FDA approved drugs are not ideal, the
most conservative filter in the form of the Abbott alerts used in this study routinely fails a
larger percentage of the compounds in the antimalarial hit datasets than in known drugs
or antimalarial compounds and this should be of concern. We have also seen a similar
pattern with hits against Mtb, of which a very high percentage (81-92%) fail these alerts,
compared to known Mtb drugs (54%) 11. While the approximately 13,500 GSK
compounds 9 have higher calculated mean molecular weight and logP 30, it is clear that
the Novartis and St Jude datasets are much closer to the mean values of the Mtb actives.
This would suggest to us that these libraries may also be quickly repurposed or, at the
very least, prioritized for screening against Mtb (after filtering of reactive compounds) as
they cover similar molecular property space. We have previously described how
computational models can be used to enrich screening libraries with Mtb actives and
enable more efficient screening and identification of hits 5, 11. The addition of
physicochemical property and reactive compound alert filtering will also provide useful
selection criteria for compounds to follow up.
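The comparison of property space made above can be expressed as a simple check of dataset means against a target window. The sketch below uses the mean MW and logP values from Table 1 and, as an assumption, the approximate ranges quoted in the abstract for active compounds (MW about 350 to 400, logP about 4.0). The tolerance of 0.5 logP units and the function name `near_active_space` are hypothetical choices for illustration, not part of the study.

```python
# Mean molecular weight and logP per dataset, copied from Table 1.
TABLE1_MEANS = {
    "GSK": (478.2, 4.5),
    "St Jude": (385.3, 3.8),
    "Novartis": (398.2, 3.7),
}

# Assumed target window, taken from the approximate values in the abstract.
MW_RANGE = (350.0, 400.0)
LOGP_TARGET = 4.0
LOGP_TOLERANCE = 0.5  # hypothetical tolerance, not from the study


def near_active_space(mw: float, logp: float) -> bool:
    """True if the mean MW is inside the window and logP is near the target."""
    mw_ok = MW_RANGE[0] <= mw <= MW_RANGE[1]
    logp_ok = abs(logp - LOGP_TARGET) <= LOGP_TOLERANCE
    return mw_ok and logp_ok


for name, (mw, logp) in TABLE1_MEANS.items():
    print(name, near_active_space(mw, logp))
```

A check on means alone ignores the spread of each dataset, so it is only a coarse first look and not a substitute for comparing the full distributions.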
Large compound libraries screened against Mtb and P. falciparum show that
active compounds have higher mean molecular weights and logP values 5, 9, 11 and, in the
majority of cases, these values are nearly identical for the two organisms. A high proportion of
compounds screened against P. falciparum fail the Abbott filters
for reactivity when compared to drugs and antimalarials, which is in agreement with our
observations for compounds active against Mtb 11, and these compounds should be
carefully studied before further optimization. Understanding the chemical properties and
characteristics of compounds used against Mtb and malaria may assist in the selection of
better compounds for lead optimization.
Abbreviations: CDD, Collaborative Drug Discovery; GSK, GlaxoSmithKline; HBA,
hydrogen bond acceptor; HBD, hydrogen bond donor; RBN, rotatable bond number.
Acknowledgements
The authors thank Dr. Jeremy Yang and colleagues (University of New Mexico) for
kindly providing access to the Smartsfilter web application and Dr. David J. Sullivan
(Johns Hopkins University) for providing the dataset of drugs tested against malaria. We
gratefully acknowledge the many groups that have provided antimalarial datasets. S.E.
acknowledges colleagues at CDD for developing the software and for assistance with large
datasets, and also our collaborators.
Competing interests: Sean Ekins is a consultant for Collaborative Drug Discovery Inc.
on a Bill and Melinda Gates Foundation Grant #49852 “Collaborative drug discovery for
TB through a novel database of SAR data optimized to promote data archiving and
sharing”. He is also on the advisory board for ChemSpider. Antony Williams is employed
by the Royal Society of Chemistry which owns ChemSpider and associated technologies.
Table 1. Mean ± SD of molecular descriptors from the CDD database for the malaria and drug datasets. MW = molecular weight,
HBD = number of hydrogen bond donors, HBA = number of hydrogen bond acceptors, Lipinski = Rule of 5 score, PSA = polar
surface area, RBN = number of rotatable bonds. Molecular properties were calculated using the Marvin plug-in (ChemAxon,
Budapest, Hungary) within the CDD database. *The analysis for the GSK dataset is in press 30 and has been compared to Mtb active
datasets in a separate study 11.
Dataset                                 MW             logP       HBD        HBA        Lipinski rule of 5 alerts  PSA (Å²)      RBN
GSK data (N = 13,471)*                  478.2 ± 114.3  4.5 ± 1.6  1.8 ± 1.0  5.6 ± 2.0  0.8 ± 0.8                  76.8 ± 30.0   7.2 ± 3.4
St Jude (N = 1524)                      385.3 ± 71.2   3.8 ± 1.6  1.1 ± 0.8  4.9 ± 1.8  0.2 ± 0.4                  72.2 ± 29.3   5.2 ± 2.3
Novartis (N = 5695)                     398.2 ± 105.3  3.7 ± 2.0  1.2 ± 1.1  4.7 ± 2.1  0.4 ± 0.7                  74.7 ± 37.9   5.6 ± 3.0
Johns Hopkins All FDA drugs (N = 2615)  349.1 ± 355.8  1.2 ± 3.4  2.4 ± 4.6  5.1 ± 5.5  0.3 ± 0.8                  96.0 ± 139.8  5.4 ± 9.6
Johns Hopkins Subset >                  458.0 ± 298.6  2.2 ± 2.7  2.1 ± 3.4  5.4 ± 4.7  0.6 ± 0.9                  90.6 ± 104.4  7.1 ± 7.7
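The "Lipinski rule of 5 alerts" column in Table 1 counts how many of the four rule of five criteria a molecule exceeds. A minimal sketch of that count follows, assuming the commonly cited thresholds (MW > 500, logP > 5, HBD > 5, HBA > 10). The values in Table 1 were calculated with the Marvin plug-in within CDD, whose exact implementation may differ, and the function name and example descriptor values here are hypothetical.

```python
def lipinski_alerts(mw: float, logp: float, hbd: int, hba: int) -> int:
    """Count rule of five violations using the commonly cited thresholds."""
    checks = (
        mw > 500.0,   # molecular weight above 500
        logp > 5.0,   # calculated logP above 5
        hbd > 5,      # more than 5 hydrogen bond donors
        hba > 10,     # more than 10 hydrogen bond acceptors
    )
    return sum(checks)


# Made-up descriptor values, not taken from any dataset in this study.
print(lipinski_alerts(mw=320.4, logp=2.1, hbd=1, hba=5))
print(lipinski_alerts(mw=612.0, logp=5.6, hbd=2, hba=7))
```

Averaging this per-molecule count over a dataset would give a value comparable in kind to the means reported in the table.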
Table 2. Summary of SMARTS filter failures for various datasets. The Abbott ALARM 24,
Glaxo 22 and Blake 31 SMARTS filter calculations were performed through the
Smartsfilter web application, Division of Biocomputing, Dept. of Biochem & Mol
Biology, University of New Mexico, Albuquerque, NM
(http://pangolin.health.unm.edu/tomcat/biocomp/smartsfilter). The GSK malaria
screening data was obtained 9 from the CDD database. The St Jude malaria data was
obtained from 10. The Novartis dataset was obtained from ChEMBL 28. We also used the
Microsource US Drugs dataset as a reference set of “drug-like” molecules. Large datasets
(> 1000 molecules) were fragmented into smaller sdf files before running through this
website. *The analysis for the GSK and Microsource datasets is in press 30 and has been
compared to Mtb active datasets in a separate study 11.
Dataset (N)                                                      Number failing Abbott ALARM filters 24 (%)  Number failing Pfizer LINT filters* (%)  Number failing Glaxo filters 22 (%)
GSK malaria hits (N = 13,355)*                                   10124 (75.8)                                7683 (57.5)                              129 (0.01)
St Jude (N = 1524)                                               1291 (84.7)                                 621 (40.7)                               83 (5.4)
Novartis (N = 5695)                                              4542 (79.7)                                 2371 (41.6)                              169 (7.5)
Johns Hopkins - All FDA drugs tested against malaria (N = 2615)  1442 (53.5)                                 1264 (46.9)                              401 (14.9)
Johns Hopkins Subset > 50% malaria inhibition at 96h (N = 165)   104 (63.0)                                  91 (55.2)                                41 (24.8)
Microsource US FDA drugs (N = 1041)                              688 (66.1)                                  516 (49.6)                               143 (13.7)
Antimalarial drugs (N = 14)                                      8 (57.1)                                    8 (57.1)                                 2 (14.3)
*Originally provided as a Sybyl script to Tripos by Dr. James Blake (Array Biopharma)
while at Pfizer and also known as the Blake filter
http://pasilla.health.unm.edu/tomcat/biocomp/smartsfilter?help=TRUE.
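The splitting of large datasets into smaller sdf files mentioned in the Table 2 caption can be sketched as follows. This is an illustration, not the procedure actually used: it assumes the usual SD file convention that each molecule record ends with a line containing `$$$$`, and the function names, default chunk size and output file naming are hypothetical.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def iter_sdf_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield one molecule record at a time; records end with a '$$$$' line.

    Trailing text without a closing '$$$$' line is ignored.
    """
    record: List[str] = []
    for line in lines:
        record.append(line)
        if line.strip() == "$$$$":
            yield "".join(record)
            record = []


def split_sdf(src: Path, out_dir: Path, chunk_size: int = 1000) -> List[Path]:
    """Write consecutive chunks of at most chunk_size records to out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    chunk: List[str] = []

    def flush() -> None:
        # Write the current chunk to a numbered file and reset it.
        path = out_dir / f"{src.stem}_{len(written) + 1:03d}.sdf"
        path.write_text("".join(chunk))
        written.append(path)
        chunk.clear()

    with src.open() as handle:
        for record in iter_sdf_records(handle):
            chunk.append(record)
            if len(chunk) == chunk_size:
                flush()
    if chunk:
        flush()
    return written
```

Calling `split_sdf(Path("hits.sdf"), Path("chunks"))` on a hypothetical input file would produce `hits_001.sdf`, `hits_002.sdf` and so on, each small enough to submit separately.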
References
1. D. A. Fidock, Nature, 2010, 465, 297-298.
2. T. S. Balganesh, P. M. Alzari and S. T. Cole, Trends Pharmacol Sci, 2008, 29,
576-581.
3. S. Nwaka and R. G. Ridley, Nat Rev Drug Discov, 2003, 2, 919-928.
4. M. Hohman, K. Gregory, K. Chibale, P. J. Smith, S. Ekins and B. Bunin, Drug
Disc Today, 2009, 14, 261-270.
5. S. Ekins, J. Bradford, K. Dole, A. Spektor, K. Gregory, D. Blondeau, M. Hohman
and B. Bunin, Mol BioSystems, 2010, 6, 840-851.
6. Editorial, The Lancet, 2010, 375, 956.
7. J. A. Maddry, S. Ananthan, R. C. Goldman, J. V. Hobrath, C. D. Kwong, C.
Maddox, L. Rasmussen, R. C. Reynolds, J. A. Secrist, 3rd, M. I. Sosa, E. L. White
and W. Zhang, Tuberculosis, 2009, 89, 354-363.
8. S. Ananthan, E. R. Faaleolea, R. C. Goldman, J. V. Hobrath, C. D. Kwong, B. E.
Laughon, J. A. Maddry, A. Mehta, L. Rasmussen, R. C. Reynolds, J. A. Secrist,
3rd, N. Shindo, D. N. Showe, M. I. Sosa, W. J. Suling and E. L. White,
Tuberculosis, 2009, 89, 334-353.
9. F.-J. Gamo, L. M. Sanz, J. Vidal, C. de Cozar, E. Alvarez, J.-L. Lavandera, D. E.
Vanderwall, D. V. S. Green, V. Kumar, S. Hasan, J. R. Brown, C. E. Peishoff, L.
R. Cardon and J. F. Garcia-Bustos, Nature, 2010, 465, 305-310.
10. W. A. Guiguemde, A. A. Shelat, D. Bouck, S. Duffy, G. J. Crowther, P. H. Davis,
D. C. Smithson, M. Connelly, J. Clark, F. Zhu, M. B. Jimenez-Diaz, M. S.
Martinez, E. B. Wilson, A. K. Tripathi, J. Gut, E. R. Sharlow, I. Bathurst, F. El
Mazouni, J. W. Fowble, I. Forquer, P. L. McGinley, S. Castro, I. Angulo-
Barturen, S. Ferrer, P. J. Rosenthal, J. L. Derisi, D. J. Sullivan, J. S. Lazo, D. S.
Roos, M. K. Riscoe, M. A. Phillips, P. K. Rathod, W. C. Van Voorhis, V. M.
Avery and R. K. Guy, Nature, 2010, 465, 311-315.
11. S. Ekins, T. Kaneko, C. A. Lipinski, J. Bradford, K. Dole, A. Spektor, K.
Gregory, D. Blondeau, S. Ernst, J. Yang, N. Goncharoff, M. Hohman and B.
Bunin, Mol BioSystems, 2010, In press.
12. G. Chen, S. Zheng, X. Luo, J. Shen, W. Zhu, H. Liu, C. Gui, J. Zhang, M. Zheng,
C. M. Puah, K. Chen and H. Jiang, J Comb Chem, 2005, 7,
398-406.
13. V. V. Zernov, K. V. Balakin, A. A. Ivashchenko, N. P. Savchuk and I. V. Pletnev,
J Chem Inf Comput Sci, 2003, 43, 2048-2056.
14. Y. Takaoka, Y. Endo, S. Yamanobe, H. Kakinuma, T. Okubo, Y. Shimazaki, T.
Ota, S. Sumiya and K. Yoshikawa, J Chem Inf Comput Sci, 2003, 43, 1269-1275.
15. W. P. Walters and M. A. Murcko, Adv Drug Del Rev, 2002, 54, 255-271.
16. I. Muegge, S. L. Heald and D. Brittelli, J Med Chem, 2001, 44, 1-6.
17. C. A. Lipinski, J Pharm Toxicol Methods, 2000, 44, 235-249.
18. Ajay, W. P. Walters and M. A. Murcko, J Med Chem, 1998, 41, 3314-3324.
19. M. P. Gleeson, J Med Chem, 2008, 51, 817-834.
20. C. A. Lipinski, F. Lombardo, B. W. Dominy and P. J. Feeney, Adv Drug Del Rev,
1997, 23, 3-25.
21. A. J. Williams, V. Tkachenko, C. Lipinski, A. Tropsha and S. Ekins, Drug
Discovery World, 2009, 10, Winter, 33-38.
22. M. Hann, B. Hudson, X. Lewell, R. Lifely, L. Miller and N. Ramsden, J Chem Inf
Comput Sci, 1999, 39, 897-902.
23. B. C. Pearce, M. J. Sofia, A. C. Good, D. M. Drexler and D. A. Stock, J Chem Inf
Model, 2006, 46, 1060-1068.
24. J. R. Huth, R. Mendoza, E. T. Olejniczak, R. W. Johnson, D. A. Cothron, Y. Liu,
C. G. Lerner, J. Chen and P. J. Hajduk, J Am Chem Soc, 2005, 127, 217-224.
25. J. R. Huth, D. Song, R. R. Mendoza, C. L. Black-Schaefer, J. C. Mack, S. A.
Dorwin, U. S. Ladror, J. M. Severin, K. A. Walter, D. M. Bartley and P. J.
Hajduk, Chem Res Toxicol, 2007, 20, 1752-1759.
26. J. T. Metz, J. R. Huth and P. J. Hajduk, J Comput Aided Mol Des, 2007, 21, 139-
144.
27. J. B. Baell and G. A. Holloway, J Med Chem, 2010, 53, 2719-2740.
28. K. Gagaring, R. Borboa, C. Francek, Z. Chen, J. Buenviaje, D. Plouffe, E.
Winzeler, A. Brinker, T. Diagena, J. Taylor, R. Glynne, A. Chatterjee and K.
Kuhen, ChEMBL-NTD (www.ebi.ac.uk/chemblntd).
29. D. L. Wheeler, T. Barrett, D. A. Benson, S. H. Bryant, K. Canese, V. Chetvernin,
D. M. Church, M. DiCuccio, R. Edgar, S. Federhen, L. Y. Geer, W. Helmberg, Y.
Kapustin, D. L. Kenton, O. Khovayko, D. J. Lipman, T. L. Madden, D. R.
Maglott, J. Ostell, K. D. Pruitt, G. D. Schuler, L. M. Schriml, E. Sequeira, S. T.
Sherry, K. Sirotkin, A. Souvorov, G. Starchenko, T. O. Suzek, R. Tatusov, T. A.
Tatusova, L. Wagner and E. Yaschenko, Nucleic Acids Res, 2006, 34, D173-180.
30. S. Ekins and A. J. Williams, Drug Disc Today, 2010, In Press.
31. J. F. Blake, Med Chem, 2005, 1,
649-655.
32. C. R. Chong and D. J. Sullivan, Jr., Nature, 2007, 448, 645-646.
33. C. R. Chong, J. Xu, J. Lu, S. Bhat, D. J. Sullivan, Jr. and J. O. Liu, ACS Chem
Biol, 2007, 2, 263-270.
34. C. R. Chong, X. Chen, L. Shi, J. O. Liu and D. J. Sullivan, Jr., Nat Chem Biol,
2006, 2, 415-416.
35. C. R. Chong, D. Z. Qian, F. Pan, Y. Wei, R. Pili, D. J. Sullivan, Jr. and J. O. Liu,
J Med Chem, 2006, 49, 2677-2680.
36. T. I. Oprea, J Comput Aided Mol Des, 2002, 16, 325-334.
37. T. I. Oprea, A. M. Davis, S. J. Teague and P. D. Leeson, J Chem Inf Comput Sci,
2001, 41, 1308-1315.
38. P. Axerio-Cilies, I. P. Castaneda, A. Mirza and J. Reynisson, Eur J Med Chem,
2009, 44, 1128-1134.
39. J. Rosen, J. Gottfries, S. Muresan, A. Backlund and T. I. Oprea, J Med Chem,
2009, 52, 1953-1962.
40. J. Seidler, S. L. McGovern, T. N. Doman and B. K. Shoichet, J Med Chem, 2003,
46, 4477-4486.
41. G. M. Rishton, Curr Opin Chem Biol, 2008, 12, 340-351.
42. G. M. Rishton, Med Chem, 2005, 1,
519-527.
43. T. I. Oprea, C. G. Bologa, S. Boyer, R. F. Curpan, R. C. Glen, A. L. Hopkins, C.
A. Lipinski, G. R. Marshall, Y. C. Martin, L. Ostopovici-Halip, G. Rishton, O.
Ursu, R. J. Vaz, C. Waller, H. Waldmann and L. A. Sklar, Nat Chem Biol, 2009,
5, 441-447.
44. K. E. Coan and B. K. Shoichet, J Am Chem Soc, 2008, 130, 9606-9612.
45. B. Y. Feng, A. Simeonov, A. Jadhav, K. Babaoglu, J. Inglese, B. K. Shoichet and
C. P. Austin, J Med Chem, 2007, 50, 2385-2390.
46. A. Jadhav, R. S. Ferreira, C. Klumpp, B. T. Mott, C. P. Austin, J. Inglese, C. J.
Thomas, D. J. Maloney, B. K. Shoichet and A. Simeonov, J Med Chem, 2010, 53,
37-51.
47. A. K. Doak, H. Wille, S. B. Prusiner and B. K. Shoichet, J Med Chem.
48. C. Schmidt, Nat Biotechnol, 2010, 28, 185-186.